Introduction to Data Science – Python Tutorial
Today is a special day because we will enter the unknown plains of data science with our friendly snake, Python. True, snakes have a bad reputation because of that whole deal with the exile from Eden and all that, but Python is a very nice language aimed at both professionals and beginners, with clean syntax rules. I feel it makes me a good programmer while I write it.
With this article I am going to teach you the basics of Python programming, because data science mostly uses either Python or R. R is also a good language, mind you, but I find Python more versatile. After you learn it you have the option to build web applications with Flask and Django, make desktop apps with Tkinter, Qt and other GUI libraries, and do data science with pandas and NumPy.
These days I aim to work in the Python field, so I am going to write articles on it. This is the sum of what I’ve learned from Tim Buchalka’s Complete Python Masterclass on Udemy. I haven’t finished it yet; I still have two chapters to go (Database Manipulation, and Lambdas and List Comprehension), but I think I am able to write a basic language tutorial that can help you understand what I am going to write in the coming days. To be fair, I am not aiming to make you a bona fide Python programmer. I am not a teacher; this is meant to give you a sense of what is going on and whet your appetite. If you are interested in learning Python with all of its nitty-gritty details, I recommend Tim Buchalka’s course.
So without further ado:
We are going to use Python 3. I know, I know, there is something of a schism in the Python world, but 3 is the chosen path, so we are going to use it. It is very simple to install: go to python.org, download Python 3 and run the setup. Be sure to tick the option that adds Python to the PATH.
As the editor, we are going to use the Community Edition of IntelliJ IDEA. Download it from the JetBrains site and set it up. After you run it, go to Configure -> Plugins, search for the Python Community Edition plugin and let it install. When you restart the IDE, you are all set to use Python with IntelliJ.
“this is variable, you can store shit there…” – Naska
As my friend Naska succinctly put it, variables are there to store values. Why use variables? Let me show you with a very basic example:
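The snippet this paragraph points to is not reproduced here, so the following is only a guess at its shape: a few calculations that all repeat the literal 4. The names a, b and c are made up for illustration.

```python
# Hypothetical example: the number 4 is typed out by hand on every line.
a = 4 + 1
b = 4 * 2
c = 4 - 3

print(a, b, c)  # prints: 5 8 1
```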
I know this doesn’t do anything, but imagine you want to change the 4 there. You would have to find every place you wrote 4 and change it to the value you have in mind. So impractical and time-consuming! There are also times when we want to keep the 4 hardcoded in some places, so a basic find and replace won’t save us here. But if you had written this snippet with variables:
All you have to do is change the value of y and it changes everywhere you use y. As you can see, it is very easy to declare a variable in Python: give it a name, put an equals sign and write your value. There are certain rules for naming variables:
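A version that uses a variable might look like the sketch below. Only the name y comes from the text; the calculations around it are assumed for the sake of the example.

```python
# y holds the value once; every calculation reads it from there.
y = 4

a = y + 1
b = y * 2
c = y - 3

print(a, b, c)  # prints: 5 8 1
```

Change the first line to y = 10 and all three results follow without touching anything else.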
- you can’t start with a number, but can use numbers afterwards: 2x is not a valid variable name but x2 is.
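- some words (such as if, for and class) are keywords in Python and cannot be used as variable names at all. Built-in names such as list or print can technically be reused as variable names if you have a specific usage in mind, but usually it is a bad idea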
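If you are ever unsure about a name, you can ask Python itself. This is a small sketch using the standard keyword module and the isidentifier string method; the variable names on the left are made up for the example.

```python
import keyword

# str.isidentifier() checks the "can't start with a number" rule.
starts_with_digit = "2x".isidentifier()    # False
digit_later = "x2".isidentifier()          # True

# keyword.iskeyword() tells you whether a word is reserved by Python.
reserved = keyword.iskeyword("for")        # True
not_reserved = keyword.iskeyword("total")  # False

print(starts_with_digit, digit_later, reserved, not_reserved)
```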
Decision Making and Flow
Python uses if, elif and else to make decisions. Flow is controlled via for and while. Let us take a brief look.
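The original example is not reproduced here, so this is a reconstruction from the explanation that follows it: an if/elif/else chain, a for loop over range(0, 10), and a while loop that runs until x is 0. The starting value of x and the name size are assumptions.

```python
x = 5

# Decision making: only the first branch whose condition is True runs.
if x > 5:
    size = "big"
elif x == 5:
    size = "exactly five"
else:
    size = "small"
print(size)  # prints: exactly five

# for: i takes each value from 0 up to (but not including) 10.
for i in range(0, 10):
    print(i)

# while: keeps looping as long as x is not 0.
while x != 0:
    x = x - 1
print(x)  # prints: 0
```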
This will take some explaining. The if construct relies on comparison operators:
- x == y means x is equal to y
- x != y means x is not equal to y
- x >= y means x is greater than or equal to y
- x <= y means x is smaller than or equal to y
- x > y means x is greater than y
- x < y means x is smaller than y
When you use them, the comparison gives you either True or False. True means "run the statements I’ve written under this condition". False means "move on to the next branch". If none of the conditions is True, the else branch runs.
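To see those True and False values for yourself, you can print the comparisons directly. The numbers 3 and 7 and the name is_smaller are picked arbitrarily for this sketch.

```python
x = 3
y = 7

print(x == y)  # False
print(x != y)  # True
print(x >= y)  # False
print(x <= y)  # True

# The result can be stored like any other value.
is_smaller = x < y
print(is_smaller)  # True
```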
With for we used the range function, which is built in to Python. Range takes the first number as a starting point, and the second number as a non-inclusive end point. So range(0, 10) gives us the numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 one after another and assigns each of them to i, which we print. You can also go in steps with range: range(0, 10, 2) means "give me numbers starting from 0 and increasing by 2", so we get 0, 2, 4, 6, 8.
While is a construct which keeps running as long as its condition is True and stops once the condition becomes False. In the example we run it until x is 0. Be aware that you can end up with an endless loop if the condition never becomes False.
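In Python 3, range hands out its numbers one at a time rather than building a real list, so wrapping it in list() is a handy way to look at what it produces. The variable names below are made up for this sketch.

```python
# list() collects the numbers that range() produces so we can inspect them.
zero_to_nine = list(range(0, 10))
evens = list(range(0, 10, 2))

print(zero_to_nine)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(evens)         # [0, 2, 4, 6, 8]
```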
Next we’ll take a look at the data structures of python, namely lists, tuples and dictionaries.
Lists are structures that can hold multiple values. I have put 1, 2, 3 in mine, but I could just as easily put 1, “a”, 2 in it. You can also append values to a list and remove values from it.
Tuples are like lists, but they cannot be modified once created; the only way to "change" one is to create a new tuple and give it the same name. This is useful if you want a pool of values that nobody can modify.
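A minimal sketch of a list in action. The names numbers and mixed are assumptions; the values come from the paragraph above.

```python
numbers = [1, 2, 3]
mixed = [1, "a", 2]   # a list can mix types

numbers.append(4)     # adds to the end: [1, 2, 3, 4]
numbers.remove(2)     # removes the first element equal to 2: [1, 3, 4]

print(numbers)        # [1, 3, 4]
print(numbers[0])     # indexes start at 0, so this prints 1
```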
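Here is a small sketch of what that looks like in practice. The name point is made up, and the try/except is only there to catch the error Python raises when you try to modify a tuple.

```python
point = (1, 2, 3)

try:
    point[0] = 10          # tuples do not allow item assignment
except TypeError as error:
    changed = False
    print("Cannot modify a tuple:", error)
else:
    changed = True

# "Changing" a tuple really means building a new one and reusing the name.
point = point + (4,)
print(point)  # (1, 2, 3, 4)
```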
Dictionaries are like lists whose entries have names. With lists I have to know the index (place) of the value if I want to get it. With dictionaries I can ask for the value I want by its name, which is called a key. You can also add values to a dictionary by assigning to a key that does not exist yet; assigning 4 to the key d, for example, stores the value 4 under d.
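The snippet for this step is not reproduced here, so here is a small sketch of it. The dictionary name letters and its first three entries are assumptions; only the key d and the value 4 come from the text.

```python
# Hypothetical dictionary; only the key "d" and the value 4 come from the text.
letters = {"a": 1, "b": 2, "c": 3}

print(letters["b"])  # look a value up by its key: 2

letters["d"] = 4     # add a new key/value pair
print(letters)       # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
```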
So, these are the basics, the VERY basics, of the python language and I’m sorry if I wasn’t clear on some points. It’s been years since I’ve written a programming article so I apologize for any inconvenience.