During my years of working as a data scientist, I’ve tried quite a number of IDEs. When I was primarily working with R, RStudio was a very nice environment to work with, but when I moved to working in Python I wasn’t able to find anything close. Over the years, I’ve tried working with just notebooks, experimented with Spyder and Rodeo, and have settled on a combination of JupyterLab for my research workflow and PyCharm when I want to write code intended for production. However, moving between two different IDEs creates a lot of friction when doing data science work, as it involves task switching and introduces needless distractions. This is particularly true because the flow of a data science project often iterates between the research and development phases.

I was therefore very excited to hear that JetBrains has released a new IDE called DataSpell, specifically designed to support data science work. This IDE aims to take those features from PyCharm that we all know and love, such as smart code completion and dependency management, while at the same time making notebooks first-class citizens. In addition, there are a number of other features that make doing a data science project that much nicer, which I’ll show in this blog post.

I should note that DataSpell also supports R natively, but in this post I’ll be focusing on its capabilities with a Python project. To explore what DataSpell can do, I’ll be creating a simple decision tree model using the Heart Failure Prediction Dataset, which was kindly uploaded to Kaggle by Federico Soriano Palacios. In this analysis, we’ll be reading in the data, doing some simple feature engineering, checking the model accuracy using cross-validation, and exploring a decision tree visualisation package called pybaobabdt, which makes interpreting decision trees easier.

Obviously, the first step is to access our data. I wanted to first show a really cool feature of DataSpell, which is its ability to support connections to databases. I really like DataGrip, so the introduction of some of the features of DataGrip into DataSpell is a very welcome addition for me.

For this project, I created a local PostgreSQL database in a Docker container. I then loaded my data into a table in this database using pandas, sqlalchemy and psycopg2. I will describe how I did this in another blog post, but for now, I want to show how easy it is to connect to this database within DataSpell and then view the table contents.

By clicking on the “Database” tab to the right of the IDE, and then clicking on “New”, I was able to easily create a connection to my database just by specifying the database name, user and password (JetBrains has instructions on how to connect to a range of databases here). Once connected, I could easily view the contents and schema of my table.

As mentioned above, I will be using pybaobabdt to help me interpret my decision tree models. However, annoyingly, this package is only compatible with Python 3.6, which I no longer have installed globally on my machine. Luckily, DataSpell offers the ability to set up your Python interpreter using a range of methods, including Conda. This meant I was quickly and easily able to set up a Python 3.6 virtual environment using Conda. Detailed instructions on how to do this are here.

This project required sklearn, numpy, pygraphviz, matplotlib, and pandas. I was able to add all of these dependencies through the UI. The only exception was pybaobabdt itself, which is only available through pip for the moment. However, the built-in terminal meant that I was easily able to activate my Conda environment and then pip install the missing package. The terminal also makes it possible to install Conda packages from the command line.

Feature engineering and modeling with DataSpell notebooks

In order to get started in DataSpell, you simply attach a directory. I selected a location on my machine to save my project and was then able to get started by creating a new Jupyter notebook.
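As a rough illustration of the loading step mentioned earlier (getting the data into a PostgreSQL table with pandas, sqlalchemy and psycopg2), here is a minimal sketch. It is not necessarily the code used for this project. The function names `build_postgres_url` and `load_csv_to_table`, the default host and port, and the choice to replace an existing table are all assumptions made for the example.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, database, host="localhost", port=5432):
    """Build a SQLAlchemy connection URL for PostgreSQL using the psycopg2 driver.

    The host and port defaults are assumptions that suit a local Docker container.
    """
    # quote_plus escapes characters such as "@" or "/" that would otherwise
    # break the URL if they appear in the user name or password.
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def load_csv_to_table(csv_path, table_name, url):
    """Read a CSV file with pandas and write it to a database table.

    Returns the number of rows that were read from the file.
    """
    # Third-party imports are kept local so that the URL helper above
    # can be used even when pandas and sqlalchemy are not installed.
    import pandas as pd
    from sqlalchemy import create_engine

    df = pd.read_csv(csv_path)
    engine = create_engine(url)
    # if_exists="replace" drops and recreates the table on every run
    # (an assumption; "append" or "fail" may suit other projects better).
    df.to_sql(table_name, con=engine, if_exists="replace", index=False)
    return len(df)
```

With a database running, this could be called as `load_csv_to_table("heart.csv", "heart_failure", build_postgres_url("postgres", "example_password", "heart"))`, where the file name, table name and credentials are placeholders rather than the values used here.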