UC Davis DataLab Technologies Toolkit

There are many programming languages used in Data Science, including but not limited to:

High-level Languages

  • R – a free programming language and software environment for statistical computing. It provides a variety of statistical and graphical techniques and is highly extensible. R is popular across research domains for developing statistical software and data analyses, and producing publication quality graphics.
  • Python – a free, general-purpose programming language that emphasizes efficiency and code readability. It is object-oriented and often used for web and app development.
  • MATLAB – a computing environment and proprietary programming language used most commonly in engineering, physics, and economics.
  • Julia – a free, dynamic programming language for numerical analysis and computing.
  • Scala – a general-purpose language combining object-oriented and functional programming. It runs on the Java platform and provides language interoperability with Java.
  • SAS – “Statistical Analysis System,” a proprietary software suite for data analytics. It is used by researchers in many domains to munge, mine and analyze data from a variety of sources. It is particularly popular in the health sciences.

Low-level Languages

Use-specific Languages

  • JavaScript – primarily used in visualization, dashboards, and mashups for webpages.
  • UNIX shell – an interactive command language and scripting language for controlling the execution of the operating system.
  • Regular Expressions – sequence of symbols and characters used to search for a string or pattern within a text.
  • SQL – Structured Query Language is used to communicate with a database and is the standard for many relational database management systems.
  • Pig – used with Apache Hadoop for complex data transformations.
  • XPath – XML Path Language is a syntax for defining parts of an XML document.
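
Three of the use-specific languages above (regular expressions, SQL, and XPath) are usually embedded inside a general-purpose language rather than run on their own. The following is a minimal sketch of that pattern in Python, using only the standard library. The function names, the `people` table, and the `<book>`/`<title>` element names are hypothetical and chosen for illustration; note that `xml.etree.ElementTree` supports only a subset of XPath.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET


def find_years(text):
    """Regular expression: return every four-digit run that stands alone as a word."""
    return re.findall(r"\b\d{4}\b", text)


def names_older_than(rows, min_age):
    """SQL: load (name, age) rows into an in-memory SQLite table and query it."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table layout, used only for this illustration.
        conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
        conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT name FROM people WHERE age > ? ORDER BY name", (min_age,)
        )
        return [name for (name,) in cursor]
    finally:
        conn.close()


def book_titles(xml_text):
    """XPath: select the text of every <title> directly under a <book> child of the root."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./book/title")]
```

For example, `find_years("Founded in 1908, renamed in 1959.")` picks out the two years as strings, and `names_older_than([("Ada", 36), ("Bo", 19)], 30)` returns only the names whose age exceeds the threshold.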

Contributions

This research toolkit is maintained by the UC Davis DataLab and is open for contribution. See how you can contribute on the GitHub repo.

This toolkit has been made possible thanks to contributions by: