Data Loading and Storage

Accessing data is the first prerequisite in Data Science. A library that handles data should be able to both read and write data on the file system in various file formats. The two data manipulation libraries discussed so far, Kotlin DataFrame and Python's pandas, both provide this functionality.

Kotlin DataFrame offers more basic I/O functionality than pandas (as said before, DataFrame is quite young and still under development), especially when it comes to file formats. In the current version (v0.10), reading and writing data is supported in four file formats: CSV, JSON, Excel spreadsheets and Apache Arrow. pandas, on the other hand, supports all four of them and more (see docs for a full list of supported file formats). Note that both DataFrame and pandas can read from the file system and from URLs.
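
As a rough illustration of the pandas side, the sketch below writes a small table to CSV and JSON and reads it back. The file names `people.csv` and `people.json`, the sample data, and the helper `round_trip` are hypothetical and chosen only for this example. Excel and Arrow are left out because they rely on optional dependencies.

```python
import tempfile
from pathlib import Path

import pandas as pd


def round_trip(df: pd.DataFrame, directory: Path) -> dict:
    """Write df as CSV and JSON into directory, then read both files back.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    csv_path = directory / "people.csv"    # assumed file name
    json_path = directory / "people.json"  # assumed file name

    # index=False keeps the row index out of the CSV file
    df.to_csv(csv_path, index=False)
    # orient="records" stores one JSON object per row
    df.to_json(json_path, orient="records")

    return {
        "csv": pd.read_csv(csv_path),
        "json": pd.read_json(json_path),
    }


# Sample data, made up for this example
people = pd.DataFrame({"name": ["Ada", "Linus"], "age": [36, 28]})

with tempfile.TemporaryDirectory() as tmp:
    loaded = round_trip(people, Path(tmp))

# The readers also accept a URL string in place of a local path.
```

Both entries of `loaded` should contain the same rows and columns as `people`. A temporary directory is used so the example leaves no files behind.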

You can find all the references for DataFrame I/O in the user guide, and plenty of documentation about pandas reading and writing data in this section.