GSOC 2014: Tatoeba Website API Creation
Please see this link for my proposal.
My name is Pankaj Gudlani, and I am a second-year undergraduate at IIT Roorkee, India.
This is my proposal for the GSoC 2014 project: Tatoeba Website API Creation.
- Email : [email protected]
- Github : pgudlani
- IRC : pgudlani
- Username on tatoeba.org : pankaj_gudlani
-
Other Contact Info
- Mobile Number : +91 8979522653
- Timezone : GMT/UTC + 05:30
- Natural languages known : English, Hindi
The aim of the project is to construct a real API. This work will help developers and users build their own tools, websites, web services or mobile applications based on the Tatoeba architecture and data. The API will be fully RESTful.
For now, the Tatoeba database is used either through the main website interface or through data dumps.
Once the API is created, the site itself will be able to work entirely from API responses.
- A web application that provides a set of API calls for data stored in the current database.
- The API should cover all data available through the current web interface, including sentence comments, wall comments, recently-added sentences and top recent contributors.
- A method for search - A search for sentences optionally including their translations.
- A method for sentence details - Request details for a single or many sentences, by id.
- A method for comment details - Request details for a single or many comments, by id.
- A method for User details - Get a single or many user profiles, by id.
- A method for searching users - For searching users by name.
- Other functions to fetch wall, comment threads and reply threads.
- Use REST (REpresentational State Transfer) so that URLs identify resources and HTTP methods identify actions.
- Use extra functions like 'embed', 'sort', 'fields' and 'depth', described further in the code architecture.
- Minification of the response.
- Pagination through 'start' and 'count'.
- All invalid requests are handled by Error Codes and messages.
My main focus is on making this API RESTful.
For this I am planning to use Django REST Framework, which is based entirely on Python. It is the best framework for creating an API that I have come across, and I have been using it for about a year now. Some useful features are:
- Of course, it is RESTful.
- It shares the common building blocks of other frameworks: models, views and URLs.
- Generic views such as ListCreateAPIView and RetrieveUpdateDestroyAPIView are very useful. The names speak for themselves.
- It has serializers and renderers, so a response can be returned in formats including JSON, XML and YAML.
- Model serializers can be used, and the fields to be returned can be specified.
- Extra fields can be returned.
- There is no need to write an authentication system from scratch. Django REST Framework's user authentication can be used.
- It works flexibly with third-party apps.
and many more.
Alternatively, I could also use one of the PHP frameworks: Guzzle or Slim.
An authentication token will be sent with every request that requires a logged-in user.
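As a rough illustration of how a client might attach such a token, here is a minimal sketch. It assumes the API adopts the `Authorization: Token <key>` header scheme of Django REST Framework's TokenAuthentication; the host name, the helper `build_authenticated_request` and the token value are placeholders, not part of the proposal.

```python
from urllib.request import Request

API_ROOT = "https://api.example.org"  # hypothetical host, not a real Tatoeba endpoint


def build_authenticated_request(path, token, method="GET", data=None):
    """Return a urllib Request carrying the user's token; nothing is sent here."""
    request = Request(API_ROOT + path, data=data, method=method)
    # Header scheme assumed from DRF's TokenAuthentication.
    request.add_header("Authorization", "Token " + token)
    return request


# Example: a POST to /sentences/ that would need a logged-in user.
req = build_authenticated_request("/sentences/", "abc123", method="POST", data=b"{}")
```

Requests that only read public data could simply omit the header.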
When GSoC and this project are over, the API will be running for 'sentences', 'comments', 'wallposts', 'users' and 'search'. Anything else added to Tatoeba after this summer will need additions to the API code.
I would like the opportunity to do that myself: reading another person's code is difficult, and I would be happy to maintain my own code.
-
request
type : "get",
"id" : [1, 2, 3, ... , 10] // ... List of sentences needed.
response
"status" : 200, // ... Status OK.
"sentences" : [
  {
    "id" : 1,
    "text" : "Ik heb honger.",
    "lang" : "nld",
    "tags" : [34, 56], // ... Primary key of tags.
    "audio" : 1,
    "user" : { "id" : 123, "username" : "ronnal42" },
    "created" : "2013-04-15 01:12:34",
    "modified" : "2013-06-01 07:14:01",
    "direct" : [985, 34232, 34224], // ... Primary key of direct sentences.
    "indirect" : [278, 8676, 3242], // ... Primary key of indirect sentences.
    "comments" : [20, 30, 40] // ... Primary key of comments.
  },
  {
    "id" : 2,
    "text" : "Honger Ik ger.",
    "lang" : "nld",
    "tags" : [10, 16],
    "audio" : 0,
    "user" : { "id" : 389, "username" : "harry" },
    "created" : "2011-12-25 11:02:40",
    "modified" : "2011-01-31 08:14:23",
    "direct" : [985, 34232, 34224],
    "indirect" : [278, 8676, 3242],
    "comments" : [22, 11, 33]
  },
  // ... Rest of the requested sentences here.
]
-
request
type : "post", // ... Feature of a RESTful API (same URL used with a different method type): POST creates a new sentence.
"sentence" : {
  "id" : 1,
  "text" : "Ik heb honger.",
  "lang" : "nld",
  "tags" : [34, 56], // ... Primary key of tags.
  "audio" : 1,
  "user" : { "id" : 123, "username" : "ronnal42" },
  "created" : "2013-04-15 01:12:34",
  "modified" : "2013-06-01 07:14:01",
  "direct" : [985, 34232, 34224], // ... Primary key of direct sentences.
  "indirect" : [278, 8676, 3242], // ... Primary key of indirect sentences.
  "comments" : [20, 30, 40] // ... Primary key of comments.
}
response
"status" : 201 // Created new instance.
-
URL : /comments/32/?embed=sentence
request
type : "get",
"embed" : "sentence"
// sentence instance in comment instance is expanded.
// if embed=sentence&sentence.tags, then sentence and tags inside
// sentence would have been expanded.
response
"status" : 200,
"comment" : {
  "id" : 32,
  "sentence" : {
    "id" : 5352,
    "text" : "Ik heb honger.",
    "lang" : "nld",
    "tags" : [34, 56],
    "audio" : 1,
    "user" : { "id" : 123, "username" : "ronnal42" },
    "created" : "2013-04-15 01:12:34",
    "modified" : "2013-06-01 07:14:01",
    "direct" : [985, 34232, 34224],
    "indirect" : [278, 8676, 3242],
    "comments" : [20, 32, 40]
  },
  "lang" : "en",
  "text" : "Skyrim belongs to the Nords!",
  "user" : { "id" : 1, "username" : "lydia" },
  "created" : "2013-12-20 13:01:43",
  "modified" : "2013-12-20 13:01:43"
}
-
URL : /sentences/1/comments/?sort=-modified
request
type : "get",
"sort" : "-modified"
// Comments on sentence(1) are rendered in descending order of "modified".
// '-' indicates descending order.
response
"comments" : [
  {
    "id" : 20,
    "sentence" : [1],
    "lang" : "en",
    "text" : "Skyrim belongs to the Nords!",
    "user" : { "id" : 1, "username" : "lydia" },
    "created" : "2013-12-20 13:01:43",
    "modified" : "2013-12-20 13:01:43" // #1 in decreasing order by modified.
  },
  {
    "id" : 40,
    "sentence" : [1],
    "lang" : "en",
    "text" : "YoYo!",
    "user" : { "id" : 3, "username" : "abcd" },
    "created" : "2013-12-20 13:01:43",
    "modified" : "2013-02-10 13:01:43" // #2 in decreasing order by modified.
  },
  // ... Rest of the comments, with "modified" in descending order.
]
-
URL : /users/2/?fields=name,img
request
type : "get",
"fields" : "name,img"
response
"status" : 200,
"user" : {
  "name" : "Bob Smith",
  "img" : "http://www.tatoeba.org/img/usrs/465465.jpg"
}
A typical example using 'embed', 'sort' and 'fields'
-
'sentences' ( /sentences/?embed=direct&sort=created&fields=id,text,lang,user.username,created,direct )
request
type : "get",
"id" : [1, 2, 3, ..., 100],
"embed" : "direct",
"sort" : "created",
"fields" : "id,text,lang,user.username,created,direct"
response
"status" : 200,
"sentences" : [
  {
    "id" : 2,
    "text" : "Honger Ik ger.",
    "lang" : "nld",
    "user" : { "username" : "harry" },
    "created" : "2011-12-25 11:02:40",
    "direct" : [ // embedded 'direct'
      {
        "id" : 985,
        "user" : { "username" : "hermonie granger" },
        "text" : "I am hungry.",
        "lang" : "eng"
      }
    ]
  },
  {
    "id" : 1,
    "text" : "Ik heb honger.",
    "lang" : "nld",
    "user" : { "username" : "ronnal42" },
    "created" : "2013-04-15 01:12:34",
    "direct" : [
      {
        "id" : 324,
        "user" : { "username" : "pgudlani" },
        "text" : "I love RESTful API.",
        "lang" : "eng"
      },
      // Rest direct sentences here.
    ]
  },
  // Rest sentences here
]
Same type of method applies to 'comments', 'user profiles'.
## Functions for Wall

* ### **'/wall/'**
URL : /wall/
request
~~~
type : "get"
~~~
response
~~~
"wallPosts" : [ // All wallposts.
{
"id" : 00001,
"user" :
{
"id" : 1,
"username" : "low"
},
"created" : "2011-12-21 15:06:21",
"modified" : "2013-01-15 09:04:30",
"text" : "So how is the work on the API going?",
"replies" : [42342, 3324, 42422, 543534]
},
{
"id" : 42342,
"user" :
{
"id" : 13,
"username" : "newuser"
},
"created" : "2011-12-21 15:06:21",
"modified" : "2013-01-15 09:04:30",
"text" : "Good.",
"replies" : [342, 531]
},
// ... Rest wallposts here
]
~~~
'embed', 'sort' and 'fields' are used as shown above.
-
URL : /wall/20/?depth=2
request
type : "get",
"depth" : 2
response
"status" : 200,
"wallPost" : {
  "id" : 20, // Depth 0
  "user" : { "id" : 1, "username" : "low" },
  "created" : "2011-12-21 15:06:21",
  "modified" : "2013-01-15 09:04:30",
  "text" : "The US government spies on the entire world.",
  "replies" : [
    {
      "id" : 42342, // Depth 1
      "user" : { "id" : 13, "username" : "newuser" },
      "created" : "2011-12-21 15:06:21",
      "modified" : "2013-01-15 09:04:30",
      "text" : "Good.",
      "replies" : [
        {
          "id" : 00001, // Depth 2
          "user" : { "id" : 1, "username" : "low" },
          "created" : "2011-12-21 15:06:21",
          "modified" : "2013-01-15 09:04:30",
          "text" : "So how is the work on the API going?",
          "replies" : [42342, 3324, 42422, 543534]
        },
        // ... Rest replies here
      ]
    },
    // ... Rest replies are here
  ]
}
Two extra attributes 'start' and 'count' will be sent to paginate the results.
Default: start=1 and count=10.
request
type : "get",
"start" : 3,
"count" : 1
response
"status" : 200 // ... Status OK.
"sentences" : [
{
"id" : 3, // results start from 3(inclusive)
"text" : "Ik heb honger.",
"lang" : "nld",
"tags" : [34, 56], // ... Primary key of tags.
"audio" : 1,
"user" :
{
"id" : 123,
"username" : "ronnal42"
},
"created" : "2013-04-15 01:12:34",
"modified" : "2013-06-01 07:14:01",
"direct": [985, 34232, 34224], // ... Primary key of direct sentences.
"indirect" : [278, 8676, 3242], // ... Primary key of indirect sentences.
"comments" : [20,30,40] // ... Primary key of comments.
}
// Only 1 result as count=1
]
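Pagination with 'start' and 'count' might be implemented roughly as below. The helper `paginate` is hypothetical, and the sketch assumes that 'start' is a 1-based position in the result list (inclusive), with the defaults stated above.

```python
def paginate(items, start=1, count=10):
    """Return `count` items beginning at the 1-based position `start`."""
    if start < 1 or count < 0:
        raise ValueError("start must be >= 1 and count must be >= 0")
    first = start - 1  # convert the 1-based position to a 0-based index
    return items[first:first + count]


sentence_ids = list(range(1, 26))  # stand-in for 25 matching sentences
page = paginate(sentence_ids, start=3, count=1)
```

A request that runs past the end of the results simply returns the items that remain.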
- 200 OK: Response to a successful GET, PUT, PATCH or DELETE. Can also be used for a POST that doesn't result in a creation.
- 201 Created: Response to a POST that results in a creation.
- 403 Forbidden: The user is not authenticated or not allowed to perform the request.
- 404 Not Found: The requested resource was not found.
- etc.
A typical error message will look like this:
{
"message" : "Multiple Errors",
"errors" : [
{
"code" : 121,
"field" : "sentence",
"message" : "Sentence Not Found"
},
{
"code" : 123,
"field" : "lang",
"message" : "Incorrect language"
}
]
}
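A payload of this shape could be assembled by a small helper such as the one sketched here. The function name `build_error_response` is hypothetical, and using the single error's own message when there is only one error is an assumption, not something specified above.

```python
import json


def build_error_response(errors):
    """Build the error payload from a list of (code, field, message) tuples."""
    if not errors:
        raise ValueError("at least one error is required")
    items = [
        {"code": code, "field": field, "message": message}
        for code, field, message in errors
    ]
    # Assumption: a single error reuses its own message as the summary.
    summary = "Multiple Errors" if len(items) > 1 else items[0]["message"]
    return {"message": summary, "errors": items}


payload = build_error_response([
    (121, "sentence", "Sentence Not Found"),
    (123, "lang", "Incorrect language"),
])
body = json.dumps(payload)  # what would be sent as the response body
```

Keeping a numeric `code` per error lets clients react programmatically without parsing the human-readable message.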
URL | GET | POST | PUT | DELETE |
---|---|---|---|---|
/sentences/ | Get data of sentences specified/first 50(if not specified). | Make A new sentence instance with data specified. | Not generally used. | Delete all sentences(not applicable). |
/sentences/12/ | Get data of sentence(id=12). | Not generally used. | Replace the sentence, or if it doesn't exist, create it. | Delete the sentence(id=12). |
/sentences/12/comments/ | Get data of all comments on sentence(id=12). | Make a new comment on sentence(id=12). | Not generally used. | Delete all comments on sentence(id=12). |
/sentences/12/comments/9/ or /comments/9/ | Get data of comment(9) on sentence(12). | Not generally used. | Replace the comment, or if it doesn't exist, create it. | Delete the comment(id=9). |
/search/?user=bob | Get All users with name containing 'bob'. | -- | -- | -- |
/search/?q=eat | Get All sentences containing 'eat'. | -- | -- | -- |
/wall/ | Get All wallposts. | Make a new wallpost with data specified. | Not generally used. | Delete all wallposts(not applicable). |
/wall/30/ | Get data of wallpost(id=30). | Not generally used. | Replace the wallpost, or if it doesn't exist, create it. | Delete the wallpost(id=30). |
I'll use test-driven development (TDD) to test my code.
TDD, in its most basic terms, is the process of implementing code by writing the tests first, seeing them fail, and then writing the code that makes them pass.
Steps:
1. Finalize the response of the function or module.
2. Write test cases according to that response.
3. Run the tests (the first run fails, of course, because the module is not yet written).
4. Write code until all the tests pass.
5. Refactor the code.
6. Repeat steps 3-5 until all tests pass.
These steps will ensure that the functionality works correctly.
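Example

Start with a view (in views.py) that does nothing yet:

    from rest_framework.generics import ListCreateAPIView

    class SentenceView(ListCreateAPIView):
        def get(self, request=None):
            pass  # not implemented yet

Then write the test:

    import unittest
    from views import SentenceView

    class TddSentences(unittest.TestCase):
        sentences = ...  # expected response (elided)

        def test_sentences_method_result(self):
            sent = SentenceView()
            result = sent.get()
            self.assertEqual(self.sentences, result)

Running the test fails with an AssertionError, because get() returns None instead of the expected sentences.
To fix this, we modify our code:

    class SentenceView(ListCreateAPIView):
        def get(self, request=None):
            # ... build the expected response in `sentences`
            return sentences

Now the test passes.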
Thus the code gets checked and refactored after each feature is added.
As we will be using Django REST Framework, we can use the pytest-django module or simply Python's unittest module.
Time Period | Goals |
---|---|
April 21 - May 18 (4 weeks) | Get familiar with the code base, read the documentation, bond with the community, and work on models and serializers. |
May 19 - June 01 (2 weeks) | Complete the views for sentences (including all 4 method types: GET, POST, PUT, DELETE). |
June 02 - June 15 (2 weeks) | Complete the views for comments/replies (including all 4 method types). |
June 16 - June 22 (1 week) | Test and fix the bugs found so far. |
June 23 | Mid-term evaluation. |
June 24 - June 30 (1 week) | Implement search (both users and sentences) and all error codes and messages. |
July 1 - July 14 (2 weeks) | Implement the other functions: 'embed', 'sort', 'fields'. |
July 15 - July 28 (2 weeks) | Final testing and further documentation. |
July 29 - Aug 18 (3 weeks) | Buffer for any remaining work; implement minification and new ideas. |
Final Submission.
The last 5 weeks also serve as spare time in case any of the earlier work piles up, so that everything is completed within the given period.
To be on the safe side, I still prefer to do most of my work before the mid-term evaluation.
I am Pankaj Gudlani, a second-year undergraduate currently pursuing my B.Tech in Computer Science Engineering at the Indian Institute of Technology Roorkee. I am also a member of the Information Management Group (IMG), a student body that manages all the internet and intranet activities of IIT Roorkee.
I am also interested in competitive programming. I was selected for the ACM-ICPC 2013 Regionals (Amritapuri site). My team was ranked 93rd in India in the online round.
I am familiar with C++, Java, PHP, Python, the Django framework (Python), HTML, CSS, JavaScript and jQuery. I have been working with git for around a year, so I am quite familiar with it.
As a member of IMG over the past year, I have been working on projects based on the Python Django framework.
Recently, I worked on an online music player website for our college.
I built its complete backend as a Python Django REST API, and did the design integration and all the required JavaScript from scratch.
The player was based entirely on the API served by the backend.
Unfortunately, it runs only on the intranet inside the IIT Roorkee campus, so I don't have a URL to show you.
I am new to open source. This project is my first opportunity to bond with an open source organization. Through GSoC I want to step into the world of open source, and I am very interested in your API creation project: I have past experience in creating APIs, which would make this an easier and better first step into the open source community.
I am devoted and committed to the work I do, so I'll work on making the Tatoeba API as good as it can be. It will surely help me enhance my skills.
I will be logged in to my Gmail and IRC during my working hours(12 noon-5 a.m. and 9 p.m.-3 a.m. IST).
I will be in touch with the community over the mailing lists and IRC for any feedback, suggestions and queries. All the code will be hosted on GitHub, so that anyone can easily track my progress and give feedback.
Besides your project being about API creation, I am very interested in working with Tatoeba because I like its concept, i.e., the translation of a single sentence into multiple foreign languages. I have spent much of my time logged in to your IRC. I know I am not good at starting conversations, but I like the environment of this organization, and I would like to work with Tatoeba.
As for your question about why you should choose me: I want to experience working in open source this summer. I am willing to enhance my skills, using my talent and ability. I even worked on a similar project last year. I believe I am the right person for this kind of project, and I'll do my best to be an asset to this organization.
No, I am not applying anywhere other than Tatoeba. I didn't have enough time to prepare for any other project.
As a second-year student, I have nothing else planned for this summer (May to mid-July), and no internship load. I can devote myself completely to the Tatoeba API project during that time. I plan to invest about 50 hours a week then, since I would prefer to finish 80% of my work in this period, before my classes start in mid-July. After that, 20 hours a week will be reserved for my classes, so my working hours will come to 35-40 hours a week for the remaining 20% of the work. That should be enough to complete all the tasks listed in my timeline.
I get attached to the code I write, so after GSoC I also hope to keep contributing to Tatoeba, and specifically to improve its API as much as I can.