View the Project on GitHub cannin/ihop-reach

29 May 2019

Week Zero | Welcome GSoC

by Rohit R Chattopadhyay

About Me

I am an undergraduate student pursuing Computer Science and Engineering at Jadavpur University, Kolkata, India. My main interest is application development, and I am also keen to explore machine learning and image processing.

GSoC Project

GSoC Project URL | Work Repository | GSoC Project Proposal

Mentor: Augustin Luna (@cannin)

The aim of the project is to build a web app that serves as an interface to biological data extracted from biomedical literature.

Stack to be used:

Community Bonding

The Community Bonding period began after the project announcements. I was given study material that helped me understand what I would be building over the summer.

I used this period to set up my development environment and discuss our approach with my mentor. I was also given the dataset, which I analysed before writing a Python script to import it into MongoDB.

This report marks the end of the Community Bonding period.

My end-semester exams ended on 22 May, so I could not devote my full time to the project. I will make up for it by putting in extra hours during the Coding period.


During the community bonding period, I was assigned the following work:

  1. Dataset import script to MongoDB

    Status: Complete
    The script is written in Python. After consulting with Augustin, I implemented a method to shred the payload, which makes the data easier to handle for both the user and the MongoDB drivers. Importing the whole dataset (16 GB) of JSON documents into a local MongoDB instance took around 20 hours on my 8 GB Ubuntu system.

    Related Issues:

  2. Setup REST API for MongoDB

    Status: Final stage
    The REST API is built with the Python-Eve framework. The following three endpoints have been set up:

    1. /articles
      to retrieve all articles
    2. /articles/{articleId}
      to retrieve an article by document ID
    3. /articles/identifier/{identifierKey}
      to retrieve one or more articles by identifier

    The identifier endpoint issues a 301 redirect to the /articles endpoint with suitable filters. Swagger is used to document the REST API. Link to Swagger documentation.

    Related Issues:

  3. Create Docker images for MongoDB and REST API

    Status: Final stage
    The Docker images for the MongoDB database and the REST API were created using docker-compose. After the required testing, they were published on Docker Hub.

    Docker Hub links:

    Related Issues:

  4. Frontend Development

    Status: Early stage
    This will be my highest priority when the coding period begins. So far, only the wireframes have been confirmed and the basic GatsbyJS setup has been done.

    Related Issues:
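
To illustrate the import step from item 1, here is a minimal sketch of a batched import. It is not the actual script. It assumes the dataset is stored as one JSON document per line, and it reads "shredding the payload" as lifting the fields of a nested object up to the top level of each document, which may differ from what was really implemented. The names `shred`, `batched`, `import_documents` and the `payload` key are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List

Document = Dict[str, Any]


def shred(document: Document, nested_key: str = "payload") -> Document:
    """Lift the fields of a nested object up to the top level.

    Assumption: this is only one possible reading of "shredding";
    the key name "payload" is hypothetical. On a name clash, the
    nested field overwrites the top-level one.
    """
    flat = {k: v for k, v in document.items() if k != nested_key}
    nested = document.get(nested_key)
    if isinstance(nested, dict):
        flat.update(nested)
    return flat


def batched(lines: Iterable[str], batch_size: int = 1000) -> Iterator[List[Document]]:
    """Parse one JSON document per line and yield fixed-size batches."""
    batch: List[Document] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        batch.append(shred(json.loads(line)))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


def import_documents(
    lines: Iterable[str],
    insert_many: Callable[[List[Document]], Any],
    batch_size: int = 1000,
) -> int:
    """Stream documents to `insert_many` in batches; return the count."""
    total = 0
    for batch in batched(lines, batch_size):
        insert_many(batch)
        total += len(batch)
    return total
```

Because the input is consumed lazily, only one batch is held in memory at a time, which matters when the dataset is larger than the available RAM. With PyMongo, the `insert_many` argument could be a collection's `insert_many` method, and `lines` could be an open file object.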

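The 301 redirect from the identifier endpoint in item 2 can be pictured with the sketch below. It only builds the redirect target and does not reproduce the actual API code. It relies on Eve's `where` query parameter, which accepts a MongoDB-style JSON filter. The function name `identifier_redirect` and the field name `identifier` are assumptions made for illustration.

```python
import json
from typing import Tuple
from urllib.parse import urlencode


def identifier_redirect(identifier_key: str, field: str = "identifier") -> Tuple[int, str]:
    """Return the (status code, Location) pair for an identifier lookup.

    Assumption: the field name "identifier" is hypothetical; the real
    document schema may store identifiers under a different key.
    """
    # Eve's `where` parameter takes a MongoDB-style filter encoded as JSON.
    where = json.dumps({field: identifier_key}, separators=(",", ":"))
    location = "/articles?" + urlencode({"where": where})
    return 301, location
```

For a made-up identifier such as `uniprot:P04637`, the Location header is the URL-encoded form of `/articles?where={"identifier":"uniprot:P04637"}`, so the client ends up on the general /articles endpoint with the filter applied.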

It has been a great learning experience so far. I had very little experience working with Python for web services, but I have been using it since the first day of Google Summer of Code 2019. I am enjoying the journey, and I hope that with time and experience the efficiency and quality of my work will improve.
My mentor, Augustin, has constantly helped me by providing materials. He is just a message away whenever I need his guidance. I am indebted to him for his patience.

In real open source, you have the right to control your own destiny.
~Linus Torvalds

tags: gsoc - weekly report - community bonding