Toby Dylan Hocking edited this page Feb 11, 2016 · 2 revisions

Application

Why does your org want to participate in Google Summer of Code?

R is a large and complex software ecosystem involving a base system, several thousand add-on packages and a number of tools and information channels, mostly web-based. We expect to develop some R packages and enhance R’s web presence, as we have done in previous years with GSOC.

How many potential mentors have agreed to mentor this year?

20+

The number of mentors will probably be the same as in previous years, between 20 and 40 depending on how many project proposals are submitted.

How will you keep mentors engaged with their students?

We expect mentors to be self-motivated to stay engaged with their students throughout GSOC. Mentors volunteer their time to create project proposals on our wiki page, and they are usually looking for students to help write code, tests, and documentation for new or existing packages. One example from this year is Marek Gagolewski, the author of the stringi package, who has already found a potential student for a project about regular expressions: https://github.com/rstats-gsoc/gsoc2016/wiki/re2-regular-expressions

How will you help your students stay on schedule to complete their projects?

We require that students provide a detailed timeline in their project proposals. Furthermore, we suggest weekly calls between mentors and students, so that students can ask for and get help with their projects.

How will you get your students involved in your community during GSoC?

Often our GSOC students are already involved via R User Groups, college or university courses that involve R, and the UseR! conferences. We will recommend that new students blog about their project on R-bloggers, and get involved with some of R’s many mailing lists.

How will you keep students involved with your community after GSoC?

R has many packages, and volunteer developers move among these from time to time. We would be happy to have students stay with the overall R family, rather than insisting that they stick with the particular package they develop for GSOC.

In the past, we have had many GSOC students stay involved in the R community. For example, some former GSOC students (e.g. Ian Fellows, Susan VanderPlas) have returned in subsequent years to become GSOC mentors. Also, Yixuan Qiu was a student in GSOC2011 and has set up an R user group at his home institution in Beijing.

Has your org been accepted as a mentoring org in Google Summer of Code before?

Yes

Which years did you participate?

2008-2015

What is your success/fail rate?

Historically, we have had very few failures. For example, in 2011 we failed 1/14 students, and in 2012 we failed 1/16 students. Since 2013, we have required at least two mentors per student, and the failure rate has dropped to zero, even though there are more students than ever (24 in 2015).

Are you part of a foundation/umbrella organization?

R is an official part of the Free Software Foundation’s GNU project, and the R Foundation is a not-for-profit organization working in the public interest. It was founded by the members of the R Development Core Team in order to:

  1. Provide support for the R project and other innovations in statistical computing.
  2. Provide a reference point for interacting with the R development community.
  3. Hold and administer the copyright of R software and documentation.

What year was your project started?

1993

Profile

URL

https://www.r-project.org/

Tagline

R is a free software environment for statistical computing and graphics

Logo

https://raw.githubusercontent.com/wiki/rstats-gsoc/gsoc2015/Rlogo.png

Primary open-source license

GPL-3

Org Category

Programming languages and development tools

Technology tags

r-project, c, c++, fortran, javascript

Topic tags

data science, visualization, statistics, graphics, machine learning

Ideas list

https://github.com/rstats-gsoc/gsoc2016/wiki/table-of-proposed-coding-projects

Short description

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

Long description

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes an effective data handling and storage facility; a suite of operators for calculations on arrays, in particular matrices; a large, coherent, integrated collection of intermediate tools for data analysis; graphical facilities for data analysis and display either on-screen or on hardcopy; and a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.

The term “environment” is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution and many more are available through the CRAN family of Internet sites covering a very wide range of modern statistics.

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hardcopy.

Application instructions

  1. Look for a project that needs a student on https://github.com/rstats-gsoc/gsoc2016/wiki/table-of-proposed-coding-projects
  2. Each project should have “tests” that students can complete to demonstrate relevant skills. After completing at least one test, please post your test results to a GitHub repo, and add a link to your test results on the wiki.
  3. Send an email to the mentors of the project. Include a link to your test results, and explain why you are interested in the project.
  4. Do NOT submit any applications to Google without getting approval from the mentors. If the mentors judge that you are capable of the project, then they will respond and help you to write a proposal to submit to Google. It should include most of the details from the project proposal wiki page, and additionally a detailed timeline that explains your plan for writing code, documentation, and tests.
  5. Once your mentors have proofread your proposal, submit it to Google at https://summerofcode.withgoogle.com/

Proposal tags

new package, existing package, visualization, machine learning, data cleaning, statistics, finance, optimization, reproducible research, bioinformatics.

Mailing list

https://github.com/rstats-gsoc/gsoc2016/wiki

General email

[email protected]

Blog URL

http://www.r-bloggers.com/
