ihop-reach

View the Project on GitHub cannin/ihop-reach

17 July 2019

Week Seven | Data Generation Pipeline

by Rohit R Chattopadhyay

The focus of our project now shifts to implementing the data-generation pipeline, which will fetch articles from the NCBI repository to keep our database up to date.

Work Progress

  1. Data Generation Pipeline

    The size of the repository is the major hurdle in this task. We tried running the pipeline on the UCSD server and found that it would take at least a year to complete.
    To cut this time down, my mentor suggested splitting the work across several computers so that the articles can be processed in parallel.
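As a rough illustration of how the work could be split, the sketch below assigns each article to one of several machines by hashing its ID. This is not the project's actual pipeline code: the function names, the `PMC...` style IDs, and the number of workers are all assumptions made for the example.

```python
import hashlib


def shard_for(article_id: str, num_workers: int) -> int:
    """Map an article ID to a worker index in the range [0, num_workers).

    A SHA-256 digest is used instead of Python's built-in hash() so that
    every machine computes the same index for the same ID.
    """
    digest = hashlib.sha256(article_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_workers


def articles_for_worker(article_ids, worker_index, num_workers):
    """Return the subset of article IDs that this worker should process."""
    return [
        article_id
        for article_id in article_ids
        if shard_for(article_id, num_workers) == worker_index
    ]


if __name__ == "__main__":
    # Hypothetical IDs; a real run would read them from the repository listing.
    ids = [f"PMC{n}" for n in range(1, 21)]
    for worker in range(4):
        print(worker, articles_for_worker(ids, worker, num_workers=4))
```

Each machine only needs the full ID list and its own worker index, so no coordination between machines is required. The shards are disjoint and together cover every article.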

  2. Setting up Analytics

    Analytics is one of the major features of the project: it will help us understand who our users are and, in turn, build a better application for them.
    Integration of Google Analytics for the static site was done using the official gatsby-plugin-google-analytics.
    For the GraphQL API, we implemented server-side analytics using Google Analytics' Measurement Protocol. For now, it is set up to record the users' geographic location and page views.
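To give an idea of what a server-side hit looks like, here is a minimal sketch of a pageview sent through the Measurement Protocol as it existed for Universal Analytics at the time of writing. It is not the project's implementation: the helper names, the tracking ID, the client ID, and the IP address are placeholders. The parameters `v`, `tid`, `cid`, `t`, `dp`, and `uip` are the protocol's own; `uip` passes the user's IP address so that Google Analytics can derive the geographic location.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GA_COLLECT_URL = "https://www.google-analytics.com/collect"


def build_pageview_payload(tracking_id, client_id, path, user_ip=None):
    """Build the form-encoded body of a Measurement Protocol pageview hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. "UA-XXXXX-Y"
        "cid": client_id,    # anonymous client identifier
        "t": "pageview",     # hit type
        "dp": path,          # document path being viewed
    }
    if user_ip:
        # IP override: lets the server report the user's address
        # instead of its own, which is what geolocation is based on.
        params["uip"] = user_ip
    return urlencode(params)


def send_pageview(payload, timeout=5.0):
    """POST the payload to the collect endpoint and return the HTTP status."""
    request = Request(GA_COLLECT_URL, data=payload.encode("utf-8"), method="POST")
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

A request handler in the API could then call `send_pageview(build_pageview_payload("UA-XXXXX-Y", client_id, "/graphql", user_ip))` for each incoming query, ideally without blocking the response on it.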

    Related Issue:

  3. Others

Conclusion

I will be relocating to Kolkata, as my classes begin next week. This will limit the hours I can give to the project, but I believe that with proper planning I will be able to handle everything smoothly.

When something is important enough, you do it even if the odds are not in your favour.
~Elon Musk

tags: gsoc - weekly report - coding period