PYTHON IN EARTH SCIENCE
A BRIEF INTRODUCTION
by Sujan Koirala and Jake Nelson
Version 1.0, February 2017
Department of Biogeochemical Integration, Max Planck Institute for Biogeochemistry, Jena, Germany
FOREWORD
This document is a summary of our experiences in learning to use Python over the last several years. It is not intended to be a standalone document that will help the user to solve every problem. What we hope is to encourage new users to delve into a wonderful programming language.

Sujan Koirala and Jake Nelson
Jena, Germany
February, 2017
CONTENTS
1 Installation and Package Management
  1.1 Introduction
    1.1.1 Python, a brief history
    1.1.2 Python 2, and Python 3
  1.2 Environments and packages
    1.2.1 Using other people’s code
    1.2.2 Which package manager to use?
    1.2.3 Versions, packages, environments, why so complicated?
  1.3 Installing Anaconda
    1.3.1 Windows installation notes
    1.3.2 OSX installation notes
    1.3.3 Linux installation notes
  1.4 Creating your first environment
    1.4.1 Installing a package
2 Python Data Types
  2.1 Basic Data Types
    2.1.1 Boolean Data
    2.1.2 Numbers
    2.1.3 Strings
    2.1.4 Bytes
  2.2 Combined Data Types
    2.2.1 Lists
    2.2.2 Tuples
    2.2.3 Sets
    2.2.4 Dictionaries
    2.2.5 Arrays
3 Input/Output of files
  3.1 Read Text File
    3.1.1 Plain Text
    3.1.2 Comma Separated Text
    3.1.3 Unstructured Text
  3.2 Save Text File
  3.3 Read Binary Data
  3.4 Write Binary Data
  3.5 Read NetCDF Data
  3.6 Write NetCDF Data
  3.7 Read MatLab Data
  3.8 Read Excel Data
4 Data Operations in Python
  4.1 Size and Shape
  4.2 Slicing and Dicing
  4.3 Built-in Mathematical Functions
  4.4 Matrix operations
  4.5 String Operations
  4.6 Other Useful Functions
5 Essential Python Scripting
  5.1 Control Flow Tools
    5.1.1 if Statement
    5.1.2 for Statement
    5.1.3 while Statement
    5.1.4 break and continue
    5.1.5 range
  5.2 Python Functions
  5.3 Python Modules
  5.4 Python Classes
  5.5 Additional Relevant Modules
    5.5.1 sys Module
    5.5.2 os Module
    5.5.3 Errors and Exceptions
6 Advanced Statistics and Machine Learning
  6.1 Quick overview
    6.1.1 required packages
    6.1.2 Overview of data
  6.2 Import and prepare the data
  6.3 Setting up the gapfillers
  6.4 Actually gapfilling
  6.5 And now the plots!
    6.5.1 scatter plots!
    6.5.2 Distributions with KDE
  6.6 Bonus points!
7 Data Visualization and Plotting
  7.1 Plotting a simple figure
  7.2 Multiple plots in a figure
  7.3 Plot with Dates
  7.4 Scatter Plots
  7.5 Playing with the Elements
  7.6 Map Map Map!
    7.6.1 Global Data
    7.6.2 Customizing a Colorbar
1 INSTALLATION OF PYTHON AND PACKAGE MANAGEMENT
This chapter provides information on the installation of core Python and the addition of packages.
1.1. INTRODUCTION
If you are currently using a recent Mac or Linux operating system, open a terminal and type,

:~ $ python

and you should see something like,

Python 2.7.12
Type "help", "copyright", "credits" or "license" for more information.
>>>

You have just entered the native installation of Python on your computer, no extra steps needed. This is because, though it is a great tool for earth science and data analytics, Python is a general purpose language that is used by all sorts of programs and utilities. While it is nice that Python is such an open and widely used tool, one should also take care that this native installation is not modified to the point that the other useful and essential utilities that depend on it are disrupted. For instance, a package or command may no longer be installed where the operating system originally put it. For this reason, this chapter will outline how to install a modern version of Python, as well as many packages useful for data science, in a tidy environment all its own.

1.1.1. PYTHON, A BRIEF HISTORY
As the story goes, in 1989 Guido van Rossum decided he needed something to do over the Christmas holidays, and instead of reading a nice book or learning to brew his own beer, he decided to develop a scripting language called Python, named after Monty Python’s Flying Circus. Since then, Python has come to be known for several core principles, the most notable of which are the focus on readability and requiring fewer lines of code. Because of this, lines of code can almost be read as a sentence in plain English. For example, if I would like to add one to every number in my list, I would write,

>>> [number+1 for number in MyList]

Though this may look daunting if you are new to coding, if you read it out loud you can almost hear what it does. And along this line, the Python philosophy tends not to be that there are many clever ways to do one thing, but one very clear way.
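As a minimal sketch of the one-line example above, assuming a small list of integers (the name my_list and its values are made up for illustration):

```python
# Hypothetical list of numbers, used only for illustration
my_list = [1, 2, 3]

# Read aloud: "number plus one, for each number in my_list"
plus_one = [number + 1 for number in my_list]

print(plus_one)
```

The comprehension builds a new list and leaves my_list unchanged.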
Because of these principles, Python can be a very useful and rewarding coding language to learn, which is reflected in its popularity.

1.1.2. PYTHON 2, AND PYTHON 3
As you start in Python, you will quickly find yourself wondering why there are two different versions being used. Python 2 was released in 2000 as the first major update, and many programs have been written using this flavor. However, in 2010 with the release of version 2.7, it was announced that Python 2 would be phased out in favor of the new Python 3, so there is no plan for a version 2.8. This major update from 2 to 3 was made to change some small yet significant things in the language, such as how it handles text data and iterates through lists and dictionaries. The idea is that it is better to update a language to fix things than to always deal with small bugs because of a refusal to change. As Python 2 is scheduled to be retired in the next few years, this manual will focus on using Python 3. This does mean that your Python 3 code may not work with your native Python 2 installation, but in the realm of data science, as you will be using so many specialized packages of code, this would be the case anyway. In the end, you will be using a self-contained Python environment that contains your Python installation, as well as all the code you will be using, in one neat little box.

1.2. ENVIRONMENTS AND PACKAGES
1.2.1. USING OTHER PEOPLE’S CODE
As Python is a general purpose language, the basic functionality out of the box is also very general: things such as basic math, file manipulation, and printing output. So if you want to do anything beyond what is defined in the core language, you need to write your own little bit of code to do it. However, as you are taking a Python course, you can assume that the first time you need a bit of code that core Python doesn’t have built in, something like calculating the standard deviation of a set of numbers, someone else will probably have run into the same issue before you.
Luckily, the Python community is very active in writing these bits of code and sharing them so that you don’t have to write every function from scratch. Not only that, many of these little bits of code have been bundled into large collections of code called packages. For example, the mean, median, standard deviation, percentile, and other statistical functions are already built into a package called NumPy (Numerical Python), which gives you access to a whole bunch of bits of code. Not only that, there are entire package managers that will take care of downloading and installing the package, as well as making sure it plays nicely with all the other packages you are using. All you have to do is tell it which package!

1.2.2. WHICH PACKAGE MANAGER TO USE?
Probably the most common package manager is called pip. pip is a wonderfully useful tool that is widely supported, which you will not use. Instead you will use Anaconda for the following reasons:
• Anaconda is designed for data science.
• Anaconda will handle not only the Python packages, but also non-Python things such as HDF5 (which allows us to read some data files) and the Math Kernel Library. It will even manage an R installation.
• Anaconda also manages environments, which:
  - Keep your Python installations working together.
  - Keep separate collections of packages in case some don’t work well together.
  - Are duplicatable and exportable, so your work can be replicated.

1.2.3. VERSIONS, PACKAGES, ENVIRONMENTS, WHY SO COMPLICATED?
Though this all may seem a bit complicated just to make a plot or do some math, it becomes necessary because of two main issues: the computer needs to know where to look for things, and what to call them. Just like when you go back to look at the wonderful photos you took on vacation 3 years ago only to find a giant mess of folders and sub-folders to go through, your computer also has to look through all its memory to find where a bit of code might be located. When properly managed, all the files are put in the appropriate place, where the computer can easily find them. Similarly, if I have a file in the folder Photos/ called MyBestPicture.jpg, and I have a different file in the folder Photos2/ called MyBestPicture.jpg, when I tell my computer I want MyBestPicture.jpg, it has no idea
which one I mean. In this way, by using these tools, you keep everything nice and tidy.

1.3. INSTALLING ANACONDA
Anaconda is a commercially maintained package manager designed for data science. As such, its maintainers have made it quite easy to install on Windows, Mac, and Linux. Simply go to https://www.continuum.io/downloads, find your operating system, and download the appropriate Python 3.6 installer. Again, you want to use version 3.6, but if you end up mixing up versions or already have another version installed, don’t panic: you can create a Python 3.6 environment later.

1.3.1. WINDOWS INSTALLATION NOTES
Installation on Microsoft Windows is fairly straightforward, but can take quite some time. Simply follow the graphical installer; the only thing to change is to uncheck the option to register Anaconda as the default Python installation. Though this is not as vital as with Unix based systems, it is still a good idea. After the long installation process, you can access an Anaconda command line via Anaconda Prompt in the Start Menu.

1.3.2. OSX INSTALLATION NOTES
Installation on OSX should be quite straightforward: simply follow the installation guide of the graphical installer.

1.3.3. LINUX INSTALLATION NOTES
Once the file has been downloaded, open a terminal and navigate to where the file was saved. The installer is a bash script, which can be run by entering

:~ $ bash Anaconda3-FILE-NAME.sh

where Anaconda3-FILE-NAME.sh is the name of your file. The installer will ask you to review the licence information and agree. You will then be asked if you would like to install Anaconda in another location, and you can simply install into the default location. The installer will then proceed to install Anaconda on your machine. Once the installation is complete, the installer will ask "You may wish to edit your .bashrc or prepend the Anaconda3 install location:", followed by a suggested command that looks something like,

export PATH=/YOUR/PATH/TO/anaconda3/bin:$PATH

In order to make Anaconda work, you need to add the file path to Anaconda to a variable the operating system uses called $PATH. To do this, you can add a modified version of this line to a file called .bashrc in your home folder. Simply go to your home folder and open the file .bashrc with a text editor, and at the end of the file add the line,

export PATH=$PATH:/YOUR/PATH/TO/anaconda3/bin

where /YOUR/PATH/TO/anaconda3/bin is the same path that Anaconda suggested at the end of installation. If you forgot it, it should be something like

/home/YOURNAME/anaconda3/bin

You may notice that the Anaconda path and $PATH have been switched around. This is because you want to add the Anaconda location to the end of $PATH, meaning that the operating system looks in this folder last instead of first. This ensures that you don’t cause any problems with the native Python installation.

1.4. CREATING YOUR FIRST ENVIRONMENT
First, you will verify that your Anaconda installation is working. To do so, open a new command line and simply type,

:~ $ conda

You should see a nice overview of how to use the conda command. If this is not the case, either the installation didn’t work, or you might have a problem with your PATH (where the computer looks for commands). But, if it worked, you can move on to creating your first environment. You will name the environment CoursePy and you will initially only require the numpy package. In the same command line, input:

:~ $ conda create --name CoursePy numpy

You will be asked if you would like to proceed in installing a bunch of new packages, way more than numpy, and you can say yes. The reason so many new packages were listed is the magic of a package manager.
The basic Python 3 with the numpy package actually depends on all these underlying packages, which Anaconda kindly figures out for you. So now you have your nice new environment, and you can activate it by entering

:~ $ source activate CoursePy

on Mac or Linux and

:~ $ activate CoursePy

on Windows. Your command line should now tell you that you are in the CoursePy environment. If you now open a Python console by typing python in the command line, the reported version should now be 3.6.0. In this same manner, you can do things like duplicate and export your environments, or make new environments with different packages or even different Python versions.

1.4.1. INSTALLING A PACKAGE
Now that you are in your nice new environment, you can add any package you might need. Open a command line and enter the CoursePy environment. Now to install the Spyder package, you simply enter,

:~ $ conda install spyder

Anaconda will list all the package changes it will make, and ask if you would like to proceed. Confirm yes, then let the magic happen. Now you have the Spyder IDE, which you can use to develop code (similar concept to R Commander or the MATLAB IDE). Anaconda has some nice documentation about how to use their software, including how to search for packages not in their repositories, which we will not cover here. Now that you have your installation and environment all sorted out, you can start to explore Python itself a bit in the next chapters.
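Once the environment is active, a short script can confirm which interpreter is actually running. This is only a sketch using the standard sys and shutil modules; the exact version and paths printed depend on your own installation.

```python
import shutil
import sys

# Version of the interpreter running this script, e.g. (3, 6, 0)
version = tuple(sys.version_info[:3])
print("Python version:", ".".join(str(part) for part in version))

# Full path of the interpreter; inside an activated environment this
# should point into that environment's folder
print("Interpreter:", sys.executable)

# Location of the conda command on PATH, or None if it cannot be found
conda_path = shutil.which("conda")
print("conda found at:", conda_path)

running_python3 = version[0] >= 3
print("Running Python 3:", running_python3)
```

If the interpreter path does not point into the CoursePy environment, the environment is probably not activated in that command line.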
2 PYTHON DATA TYPES
This chapter provides information on the basic data types in Python. It also introduces the basic operations used to access and manipulate the data.
In Python, there are various types of data. Every piece of data has a type and a value. Every value has a fixed data type, but the type does not need to be declared beforehand. The most basic data types in Python are:
1. Boolean: These are data which have only two values: True or False.
2. Numbers: These are numeric data.
3. Strings: These data are sequences of Unicode characters.
4. Bytes: An immutable sequence of numbers between 0 and 255.
Furthermore, these data types can be combined, and the following types of datasets can be produced:
1. Lists: Ordered sequences of values.
2. Tuples: Ordered but immutable, i.e. cannot be modified, sequences of values.
3. Sets: Unordered bags of values.
4. Dictionaries: Unordered bags of key-value pairs.
5. Arrays: Ordered sequences of data of the same type mentioned above.

2.1. BASIC DATA TYPES
In this section, the basic data types, their possible values, and the operations that can be applied to them are briefly described.

2.1.1. BOOLEAN DATA
These data are either True or False. If an expression can produce either a yes or a no answer, booleans can be used to interpret the result. Such yes/no situations are known as boolean contexts. Here is a simple example.
• Assign some variable (size) as 1.
In [1]: size = 1
• Check if size is less than 0.
In [2]: size < 0
Out[2]: False
It is false as 1 > 0.
• Check if size is greater than 0.
In [3]: size > 0
Out[3]: True
It is true as 1 > 0.
True or False can also be treated as numbers: True=1 and False=0.

2.1.2. NUMBERS
Python supports both integers and floating point numbers. There’s no type declaration to distinguish them; Python tells them apart by the presence or absence of a decimal point.
• You can use the type() function to check the type of any value or variable.
In [4]: type(1)
Out[4]: int
As expected, 1 is an int.
In [5]: type(1.)
Out[5]: float
The decimal point at the end makes 1. a float.
In [6]: 1+1
Out[6]: 2
Adding an int to an int yields an int.
In [7]: 1+1.
Out[7]: 2.0
Adding an int to a float yields a float. Python coerces the int into a float to perform the addition, then returns a float as the result.
• An integer can be converted to a float using float() and a float can be converted to an integer using int().
In [8]: float(2)
Out[8]: 2.0
In [9]: int(2.6)
Out[9]: 2
Python truncates the float to an integer: 2.6 becomes 2 instead of 3. To round the float number use
In [10]: round(2.6)
Out[10]: 3
In Python 3, round() with a single argument returns an int; in Python 2 it returned the float 3.0.

NUMERICAL OPERATIONS
• The / operator performs division.
In [11]: 1/2
Out[11]: 0.5
In [12]: 1/2.
Out[12]: 0.5
In Python 3, / always performs true division and returns a float, even when both operands are integers. In Python 2, dividing two integers with / returned the truncated integer result (1/2 gave 0), so be careful about integer and float data types if your code is ever run under Python 2.
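The type rules described so far can be checked in one short script. This is a sketch that assumes Python 3; under Python 2 the results of 1/2 and round(2.6) would differ.

```python
# Booleans behave like the integers 1 and 0
total = True + True          # 2

# Mixing an int and a float coerces the result to float
mixed = 1 + 1.               # 2.0

# int() truncates towards zero, round() goes to the nearest integer
truncated = int(2.6)         # 2
rounded = round(2.6)         # 3

# In Python 3, / is true division even for two integers
half = 1 / 2                 # 0.5

for name, value in [("total", total), ("mixed", mixed),
                    ("truncated", truncated), ("rounded", rounded),
                    ("half", half)]:
    print(name, value, type(value).__name__)
```

Printing the type name next to each value makes it easy to see where an int silently became a float.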
• The // operator performs floor division: the result is rounded down to the nearest integer, i.e. towards negative infinity. For a positive result this is the same as truncating, but a negative result is rounded away from zero. The result is a float if either operand is a float, and an int otherwise.
In [13]: 1.//2
Out[13]: 0.0
In [14]: -1.//2
Out[14]: -1.0
• The ‘**’ operator means “raised to the power of”. 11² is 121.
In [15]: 11**2
Out[15]: 121
In [16]: 11**2.
Out[16]: 121.0
Be careful about float and integer data types: the result is an int when both operands are ints, and a float when either operand is a float, as shown above.
• The ‘%’ operator gives the remainder after performing integer division.
In [17]: 11%2
Out[17]: 1
11 divided by 2 is 5 with a remainder of 1, so the result here is 1.

FRACTIONS
To start using fractions, import the fractions module. To define a fraction, create a Fraction object as
In [18]: import fractions
         fractions.Fraction(1, 2)
Out[18]: Fraction(1, 2)
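The operators above can be tried side by side. The sketch below assumes Python 3 and uses only the standard library; divmod() is a built-in not covered in the text that returns the floor quotient and the remainder together.

```python
from fractions import Fraction

# Floor division rounds towards negative infinity
pos = 7 // 2        # 3
neg = -7 // 2       # -4, not -3
as_float = 1. // 2  # 0.0, a float because one operand is a float

# Power and remainder
square = 11 ** 2    # 121
remainder = 11 % 2  # 1

# divmod() returns (floor quotient, remainder) in one call
quotient_and_remainder = divmod(11, 2)  # (5, 1)

# A Fraction stores an exact ratio of two integers
half = Fraction(1, 2)

print(pos, neg, as_float, square, remainder, quotient_and_remainder, half)
```

The negative case is the one worth remembering: floor division and truncation only agree when the result is positive.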
You can perform all the usual mathematical operations with fractions as
In [19]: fractions.Fraction(1, 2)*2
Out[19]: Fraction(1, 1)

TRIGONOMETRY
You can also do basic trigonometry in Python.
In [20]: import math
         math.pi
Out[20]: 3.141592653589793
In [21]: math.sin(math.pi / 2)
Out[21]: 1.0

2.1.3. STRINGS
In Python, all strings are sequences of Unicode characters. A string is an immutable sequence and cannot be modified in place.
• To create a string, enclose it in quotes. Python strings can be defined with either single quotes (' ') or double quotes (" ").
In [22]: s='sujan'
In [23]: s="sujan"
• The built-in len() function returns the length of the string, i.e. the number of characters.
In [24]: len(s)
Out[24]: 5
• You can get individual characters out of a string using index notation.
In [25]: s[1]
Out[25]: 'u'
• You can concatenate strings using the + operator.
In [26]: s+' '+'koirala'
Out[26]: 'sujan koirala'
Even a space has to be specified explicitly, as a string containing a space.

2.1.4. BYTES
An immutable sequence of numbers between 0 and 255 is called a bytes object. Each byte within the bytes object can be an ASCII character or an encoded hexadecimal number from \x00 to \xff (0–255).
• To define a bytes object, use the b' ' syntax. This is commonly known as “byte literal” syntax.
In [27]: by = b'abcd\x65'
         by
Out[27]: b'abcde'
\x65 is 'e'.
• Just like strings, you can use the len() function and the + operator to concatenate bytes objects. But you cannot join strings and bytes.
In [28]: len(by)
Out[28]: 5
In [29]: by += b'\x66'
         by
Out[29]: b'abcdef'

2.2. COMBINED DATA TYPES
The basic data types explained in the previous section can be arranged in sequences to create combined data types. Some of these combined data types can be modified, e.g. lists, while others are immutable and cannot be modified, e.g. tuples. This section provides a brief description of these data types and the common operations that can be used on them.
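To make the difference between mutable and immutable types concrete, here is a small sketch: a list item can be reassigned, while the same attempt on a tuple, a string, or a bytes object raises a TypeError. The variable names are illustrative only.

```python
a_list = ['a', 'b', 'c']
a_list[0] = 'z'            # lists are mutable, so this works

immutable_values = [('a', 'b', 'c'), 'abc', b'abc']
failed = []
for value in immutable_values:
    try:
        value[0] = 'z'     # item assignment is not supported
    except TypeError:
        # Record the type name of every value that refused the change
        failed.append(type(value).__name__)

print(a_list)
print(failed)
```

To "change" an immutable value you build a new one instead, for example by slicing and concatenating.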
2.2.1. LISTS
A list is a sequence of data stored in order. It can hold different types of data (strings, numbers etc.) and it can be modified to add new data or remove old data.

CREATING A LIST
To create a list, use square brackets “[ ]” to wrap a comma-separated list of values of any data types.
In [30]: a_list=['a','b','mpilgrim','z','example',2]
         a_list
Out[30]: ['a', 'b', 'mpilgrim', 'z', 'example', 2]
All items except the last are strings. The last one is an integer.
In [31]: a_list[0]
Out[31]: 'a'
List data can be accessed using an index.
In [32]: type(a_list[0])
Out[32]: str
In [33]: type(a_list[-1])
Out[33]: int
The type of data can be checked using type().

SLICING A LIST
Once a list has been created, a part of it can be taken as a new list. This is called slicing the list. A slice can be extracted using indices. Let’s consider the same list as above:
In [34]: a_list=['a','b','mpilgrim','z','example',2]
• The length of the list can be obtained as:
In [35]: len(a_list)
Out[35]: 6
The index can run from 0 to 5 if we count from left to right, or from -1 to -6 if we count from right to left.
• We can obtain any other list as:
In [36]: b_list=a_list[0:3]
         b_list
Out[36]: ['a', 'b', 'mpilgrim']
The slice a_list[0:3] contains the items at indices 0, 1 and 2; the end index 3 itself is excluded.

ADDING ITEM TO A LIST
There are 4 different ways to add an item or items to a list. Let’s consider the same list as above:
In [37]: a_list=['a','b','mpilgrim','z','example',2]
1. ‘+’ operator: The + operator concatenates lists to create a new list. A list can contain any number of items; there is no size limit.
In [38]: b_list=a_list+['Hydro','Aqua']
         b_list
Out[38]: ['a','b','mpilgrim','z','example',2,'Hydro','Aqua']
2. append(): The append() method adds a single item to the end of the list. Even if the added item is a list, the whole list is added as a single item in the old list.
In [39]: b_list.append(True)
         b_list
Out[39]: ['a','b','mpilgrim','z','example',2,'Hydro','Aqua',True]
This list has string, integer, and boolean data.
In [40]: len(b_list)
Out[40]: 9
In [41]: b_list.append(['d','e'])
         b_list
Out[41]: ['a','b','mpilgrim','z','example',2,'Hydro','Aqua',True,['d','e']]
In [42]: len(b_list)
Out[42]: 10
The length of b_list has increased by only one even though the appended list, ['d', 'e'], contains two items.
3. extend(): Similar to append(), but each item of the argument is added separately. For example, let’s consider the list
In [43]: b_list=['a','b','mpilgrim','z','example',2,'Hydro','Aqua',True]
         len(b_list)
Out[43]: 9
In [44]: b_list.extend(['d','e'])
         b_list
Out[44]: ['a','b','mpilgrim','z','example',2,'Hydro','Aqua',True,'d','e']
In [45]: len(b_list)
Out[45]: 11
The length of b_list has increased by two, as the two items in the list ['d', 'e'] were added separately.
4. insert(): The insert() method inserts a single item into a list. The first argument is the index of the first item in the list that will get bumped out of position.
In [46]: b_list=['a','b','mpilgrim','z','example',2,'Hydro','Aqua',True]
         b_list.insert(0,'d')
Insert 'd' in the first position, i.e., index 0.
In [47]: b_list
Out[47]: ['d','a','b','mpilgrim','z','example',2,'Hydro','Aqua',True]
In [48]: b_list.insert(0,['x','y'])
In [49]: b_list
Out[49]: [['x','y'],'d','a','b','mpilgrim','z','example',2,'Hydro','Aqua',True]
The list ['x', 'y'] is added as one item, as in the case of append().

SEARCH FOR ITEM IN A LIST
Consider the following list:
In [50]: b_list=['a','b','mpilgrim','z','example',2,'Hydro','Aqua','b']
• count() returns the number of times a value occurs in the list, as it does for strings.
In [51]: b_list.count('b')
Out[51]: 2
• in can be used to check if a certain value exists in a list.
In [52]: 'b' in b_list
Out[52]: True
In [53]: 'c' in b_list
Out[53]: False
The output is boolean data, i.e., True or False.
• index() can be used to find the index of a value.
In [54]: b_list.index('a')
Out[54]: 0
In [55]: b_list.index('b')
Out[55]: 1
Even though there are 2 'b', the index of the first 'b' is returned.
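The adding and searching methods above can be combined in a short sketch. The helper find_index() is hypothetical, written here only to show one way of avoiding the ValueError that index() raises when the value is absent.

```python
def find_index(items, value):
    """Return the index of the first match, or None if value is absent."""
    if value in items:
        return items.index(value)
    return None


b_list = ['a', 'b', 'mpilgrim']
b_list.append(['d', 'e'])   # adds one item (a nested list)
b_list.extend(['d', 'e'])   # adds two separate items
b_list.insert(0, 'b')       # puts 'b' at index 0

print(b_list)
print(len(b_list))
print(b_list.count('b'))
print(find_index(b_list, 'b'))
print(find_index(b_list, 'c'))
```

Checking with in before calling index() keeps the lookup from stopping the script when the value is missing.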
REMOVING ITEM FROM A LIST
There are several ways to remove an item from a list. The list automatically adjusts its size after an element has been removed.

REMOVING ITEM BY INDEX
The del statement removes an item from a list when the index of the element to be removed is provided.
• Consider the following list:
In [56]: b_list=['a','b','mpilgrim','z','example',2,'Hydro','Aqua','b']
• Suppose we want to remove the element 'mpilgrim' from the list. Its index is 2.
In [57]: b_list[2]
Out[57]: 'mpilgrim'
In [58]: del b_list[2]
         b_list
Out[58]: ['a','b','z','example',2,'Hydro','Aqua','b']
'mpilgrim' is now removed.
The pop() method can also remove an item by specifying an index. But it is even more versatile, as it can be used without any argument to remove the last item of a list.
• Consider the following list:
In [59]: b_list=['a','b','mpilgrim','z','example',2,'Hydro','Aqua','b']
• Suppose we want to remove the element 'mpilgrim' from the list. Its index is 2.
In [60]: b_list[2]
Out[60]: 'mpilgrim'
In [61]: b_list.pop(2)
Out[61]: 'mpilgrim'
The removed item is returned and displayed.
• Now b_list is as follows:
['a','b','z','example',2,'Hydro','Aqua','b']
• If pop() is used without an argument,
In [62]: b_list.pop()
• b_list is now as follows:
['a','b','z','example',2,'Hydro','Aqua']
The last 'b' is removed from the list.
• If pop() is used once again, the list will be as follows:
['a','b','z','example',2,'Hydro']

REMOVING ITEM BY VALUE
The remove() method removes the first item in a list that matches the specified value.
• Consider the following list:
In [63]: b_list=['a','b','mpilgrim','z','example',2,'Hydro','Aqua','b']
• Suppose we want to remove the element 'b' from the list.
In [64]: b_list.remove('b')
In [65]: b_list
Out[65]: ['a','mpilgrim','z','example',2,'Hydro','Aqua','b']
Only the first 'b' is removed; the second 'b' at the end of the list remains. To remove every 'b', remove() has to be called again for each remaining occurrence.
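A compact sketch of the three removal tools, plus one common way to drop every occurrence of a value. The list comprehension at the end is an addition not covered in the text; it builds a new list rather than modifying the old one.

```python
b_list = ['a', 'b', 'mpilgrim', 'z', 'b']

del b_list[2]            # remove by index, nothing is returned
last = b_list.pop()      # remove and return the last item
b_list.remove('a')       # remove the first item equal to 'a'

print(b_list, last)

# remove() only deletes the first match; to drop all matches,
# build a new list without them
c_list = ['b', 'x', 'b', 'y']
without_b = [item for item in c_list if item != 'b']
print(without_b)
```

Note that remove() raises a ValueError if the value is not in the list, so it is worth checking with in first when that can happen.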
2.2.2. TUPLES
A tuple is an immutable list. A tuple cannot be changed in any way once it is created.
• A tuple is defined in the same way as a list, except that the whole set of elements is enclosed in parentheses instead of square brackets.
• The elements of a tuple have a defined order, just like a list. Tuple indices are zero based, just like a list, so the first element of a non-empty tuple is always t[0].
• Negative indices count from the end of the tuple, just as with a list.
• Slicing works too, just like a list. Note that when you slice a list, you get a new list; when you slice a tuple, you get a new tuple.
• A tuple is used because working with a tuple is typically somewhat faster than working with a list, and because it is protected from accidental changes. If you do not need to modify a set of items, a tuple can be used instead of a list.

CREATING TUPLES
A tuple can be created just like a list, but parentheses “( )” have to be used instead of square brackets “[ ]”. For example,
In [66]: a_tuple=('a','b','mpilgrim','z','example',2,'Hydro','Aqua','b')

TUPLE OPERATIONS
All the list operations except the ones that modify the list itself can be used for tuples too. For example, you cannot use append(), extend(), insert(), del, remove(), and pop() for tuples. For other operations, please follow the same steps as explained in the previous section. Here are some examples of tuple operations.
• Consider the following tuple:
In [67]: a_tuple=('a','b','mpilgrim','z','example',2,'Hydro','Aqua','b')
In [68]: a_tuple.index('z')
Out[68]: 3
✓ Item 'z' is at index 3, i.e., it is the fourth element of the tuple.
In [69]: b_tuple=a_tuple[0:4]
In [70]: b_tuple
Out[70]: ('a','b','mpilgrim','z')
✓ A new tuple can be created by slicing a tuple; the original tuple does not change.
In [71]: a_tuple
Out[71]: ('a','b','mpilgrim','z','example',2,'Hydro','Aqua','b')

2.2.3. SETS
A set is an unordered collection of unique values. A single set can contain values of different datatypes, as long as they are immutable (e.g., numbers, strings, and tuples).

CREATING SET
There are basically two ways of creating a set.
1. From scratch: Sets can be created like lists, but curly brackets “{ }” have to be used instead of square brackets “[ ]”. For e.g.,
In [72]: a_set={'a','b','mpilgrim','z','example',2,'Hydro','Aqua','b'}
In [73]: type(a_set)
Out[73]: set
In [74]: a_set
Out[74]: {2,'Aqua','Hydro','a','b','example','mpilgrim','z'}
✓ The displayed order differs from the order of the values given inside { } because a set is unordered and the original order is ignored. Also, there is only one 'b' in the set even though two 'b' were given, because a set is a collection of unique values. Duplicate values are taken as one.
2. From list or tuple: A set can be created from a list or a tuple as,
In [75]: set(a_list)
         set(a_tuple)

MODIFYING SET
A set can be modified by adding an item or another set to it. Also, items of a set can be removed.

ADDING ELEMENTS
• Consider a set as follows,
In [76]: a_set={2,'Aqua','Hydro','a','b','example','mpilgrim','z'}
• Using add: To add a single item to a set.
In [77]: a_set.add('c')
In [78]: a_set
Out[78]: {2,'Aqua','Hydro','a','b','c','example','mpilgrim','z'}
✓ 'c' is added to the set. Since a set is unordered, the position at which it is displayed has no meaning.
• Using update: To add multiple items given as a set, list, or tuple.
In [79]: a_set.update(['a','Sujan','Koirala'])
In [80]: a_set
Out[80]: {2,'Aqua','Hydro','Koirala','Sujan','a','b','c','example','mpilgrim','z'}
✓ 'Koirala' and 'Sujan' are added, but 'a' is not added again because it is already in the set. The items have to be passed inside a list, tuple, or set: update() treats a bare string as a sequence of characters, so a_set.update('Sujan') would add 'S', 'u', 'j', 'a', and 'n' as separate items.

REMOVING ELEMENTS
• Consider a set as follows,
In [81]: a_set={2,'Aqua','Hydro','Koirala','Sujan','a','b','c','example','mpilgrim','z'}
• Using remove() and discard(): These are used to remove an item from a set.
In [82]: a_set.remove('b')
In [83]: a_set
Out[83]: {2,'Aqua','Hydro','Koirala','Sujan','a','c','example','mpilgrim','z'}
✓ 'b' has been removed.
In [84]: a_set.discard('Hydro')
In [85]: a_set
Out[85]: {2,'Aqua','Koirala','Sujan','a','c','example','mpilgrim','z'}
✓ 'Hydro' has been removed. The two methods differ only when the item is not in the set: remove() raises a KeyError, whereas discard() does nothing.
• Using pop() and clear(): pop() removes and returns one item, as it does for a list, but it does not necessarily remove the last item; because a set is unordered, the item that is removed is arbitrary. clear() removes all items and leaves an empty set.
In [86]: a_set.pop()
In [87]: a_set
Out[87]: {2,'Koirala','Sujan','a','c','example','mpilgrim','z'}
✓ Here 'Aqua' happened to be removed; a different item may be removed in another session.

SET OPERATIONS
Two sets can be combined into a new set (union), or the elements common to two sets can be collected into a new set (intersection). These operations are also useful for combining or comparing two or more lists after converting them to sets.
• Consider the following two sets,
In [88]: a_set={2,4,5,9,12,21,30,51,76,127,195}
         b_set={1,2,3,5,6,8,9,12,15,17,18,21}
• Union: Can be used to combine two sets.
In [89]: c_set=a_set.union(b_set)
In [90]: c_set
Out[90]: {1,2,195,4,5,6,8,12,76,15,17,18,3,21,30,51,9,127}
• Intersection: Can be used to create a set with the elements common to two sets.
In [91]: d_set=a_set.intersection(b_set)
In [92]: d_set
Out[92]: {9,2,12,5,21}

2.2.4. DICTIONARIES
A dictionary is a collection of key-value pairs. A value can be retrieved directly for a known key, but a key cannot be looked up directly from its value.

CREATING DICTIONARY
Creating a dictionary is similar to creating a set in using curly brackets “{ }”, but key:value pairs are used instead of values. The following is an example,
In [93]: a_dict={'Hydro':'131.112.42.40','Aqua':'131.112.42.41'}
In [94]: a_dict
Out[94]: {'Aqua':'131.112.42.41','Hydro':'131.112.42.40'}
✓ The displayed order may differ from the order in which the pairs were given. In older versions of Python the order of the pairs is not preserved, as shown here; since Python 3.7, a dictionary keeps the order in which the pairs were inserted. In either case, values are accessed by key and not by position.
In [95]: a_dict['Hydro']
Out[95]: '131.112.42.40'
✓ Key 'Hydro' can be used to access the value '131.112.42.40'.

MODIFYING DICTIONARY
Since the size of a dictionary is not fixed, new key:value pairs can be freely added to the dictionary. Also, the value for a key can be modified.
• Consider the following dictionary.
In [96]: a_dict={'Aqua':'131.112.42.41','Hydro':'131.112.42.40'}
• If you want to change the value of 'Aqua',
In [97]: a_dict['Aqua']='192.168.1.154'
In [98]: a_dict
Out[98]: {'Aqua':'192.168.1.154','Hydro':'131.112.42.40'}
• If you want to add a new item to the dictionary,
In [99]: a_dict['Lab']='Kanae'
In [100]: a_dict
Out[100]: {'Aqua':'192.168.1.154','Hydro':'131.112.42.40','Lab':'Kanae'}
• Dictionary values can also be lists instead of single values. For e.g.,
In [101]: k_lab={'Female':['Yoshikawa','Imada','Yamada','Sasaki','Watanabe','Sato'],'Male':['Sujan','Iseri','Hagiwara','Shiraha','Ishida','Kusuhara','Hirochi','Endo']}
In [102]: k_lab['Female']
Out[102]: ['Yoshikawa','Imada','Yamada','Sasaki','Watanabe','Sato']

2.2.5. ARRAYS
Arrays are similar to lists, but they contain homogeneous data, i.e., data of the same type only. Arrays are commonly used to store numbers and hence are used in mathematical calculations. The functions used below (array, arange, zeros, ones) come from the NumPy package and have to be imported before use, e.g., with from numpy import array.

CREATING ARRAYS
Arrays can be created in many ways. They can also be read from a data file in text or binary format, which is explained in later chapters of this guide. Here, some commonly used methods are explained. For a detailed tutorial on arrays, refer to the NumPy documentation.
1. From list: Arrays can be created from a list or a tuple using:
✓ somearray=array(somelist).
Consider the following examples.
In [103]: b_list=['a','b',1,2]
✓ The list has mixed datatypes. The first two items are strings and the last two are numbers.
In [104]: b_array=array(b_list)
          b_array
Out[104]: array(['a','b','1','2'], dtype='<U21')
✓ Since the first two elements are strings, the numbers are also converted to strings when the array is created. The exact dtype shown depends on the Python and NumPy versions; Python 2 reports a byte-string dtype beginning with '|S'.
In [105]: b_list2=[1,2,3,4]
✓ All items are numbers.
In [106]: b_array2=array(b_list2)
In [107]: b_array2
Out[107]: array([1, 2, 3, 4])
✓ A numeric array is created. Mathematical operations like addition, subtraction, division, etc. can be carried out on this array.
2. Using built-in functions:
(a) From direct values:
In [108]: xx=array([2, 4, -11])
✓ xx is a one-dimensional array of length 3, i.e., its shape is (3,). A shape of (1,3), meaning 1 row and 3 columns, would require nested brackets: array([[2, 4, -11]]).
(b) From arange(number): Creates an array from a range of values. Examples are provided below. For details of arange, see Chapter 4.
In [109]: yy=arange(2,5,1)
In [110]: yy
Out[110]: array([2,3,4])
✓ Creates an array from the lower value (2) to the upper value (5) at the specified interval (1), excluding the upper value (5).
In [111]: yy=arange(5)
In [112]: yy
Out[112]: array([0,1,2,3,4])
✓ If the lower value and interval are not specified, they are taken as 0 and 1, respectively.
In [113]: yy=arange(5,2,-1)
In [114]: yy
Out[114]: array([5,4,3])
✓ The interval can be negative.
(c) Arrays of fixed shape: Sometimes it is necessary to create an array in advance to store the result of a calculation. The functions zeros(someshape) and ones(someshape) can be used to create arrays with all values as zero or one, respectively. For more than one dimension, the shape has to be given as a tuple, i.e., inside its own pair of parentheses.
In [115]: zz=zeros(20)
✓ will create an array with 20 zeros.
In [116]: zz=zeros((20,20))
✓ will create an array with 20 rows and 20 columns (total 20*20=400 elements) with all elements as zero.
In [117]: zz=zeros((20,20,20))
✓ will create an array with 20 blocks, each block having 20 rows and 20 columns (total 20*20*20=8000 elements), with all elements as zero.

ARRAY OPERATIONS
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.
In [118]: a=array([20,30,40,50])
          b=arange(4)
In [119]: b
Out[119]: array([0,1,2,3])
In [120]: c=a-b
In [121]: c
Out[121]: array([20, 29, 38, 47])
✓ Each element in b is subtracted from the respective element in a.
In [122]: b**2
Out[122]: array([0, 1, 4, 9])
✓ Square of each element in b.
In [123]: 10*sin(a)
Out[123]: array([ 9.12945251, -9.88031624, 7.4511316 , -2.62374854])
✓ Two operations can be carried out at once: the sine of each element of a (taken in radians) is computed first and then multiplied by 10.
In [124]: a<35
Out[124]: array([ True, True, False, False])
✓ 'True' where a<35, otherwise 'False'. Older versions of NumPy also print dtype=bool at the end of the output.
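The array creation functions and elementwise operations described above can be tried together in a short script. The following is a minimal sketch that assumes NumPy is installed and imported under the conventional alias np (an assumption; the examples above call array() and arange() without a prefix). It also shows that zeros() and ones() take the shape of a multi-dimensional array as a single tuple.

```python
import numpy as np

# A list with mixed types becomes an array of strings.
mixed = np.array(['a', 'b', 1, 2])

# Fixed-shape arrays: the shape is passed as one tuple.
zz = np.zeros((20, 20))        # 20 rows x 20 columns
blocks = np.ones((2, 3, 4))    # 2 blocks, each with 3 rows x 4 columns

# Elementwise arithmetic and comparison.
a = np.array([20, 30, 40, 50])
b = np.arange(4)               # array([0, 1, 2, 3])
c = a - b                      # elementwise subtraction
squares = b ** 2               # elementwise power
mask = a < 35                  # elementwise comparison gives booleans

print(mixed.dtype.kind)        # 'U' means a Unicode string array (Python 3)
print(zz.shape, blocks.size)
print(c, squares, mask)
```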
3 INPUT/OUTPUT OF FILES
Read and write data from/to files in commonly used data formats, such as text (csv), binary, Excel, netCDF, R data frame, and MATLAB.
This chapter explains how to read and write data from/to commonly used data formats, such as text (csv), binary, Excel, netCDF, and MATLAB.

3.1. READ TEXT FILE
Small datasets are often stored in a structured or unstructured text format. Python libraries are able to read these data files in several ways.

3.1.1. PLAIN TEXT
First, we will load data from a free-format structured text file (e.g., ASCII data) using the NumPy function loadtxt.
In [125]: a=loadtxt('example_plain.txt', comments='#', delimiter=None, converters=None, skiprows=0, usecols=None)
Reads the data in the text file as an 'array'. The options shown here are the default values: lines starting with '#' are treated as comments, and delimiter=None means that the values are separated by whitespace. An error is raised if the data are non-numeric (neither float nor integer).

3.1.2. COMMA SEPARATED TEXT
Often the data values in a text file are separated by a special character such as a tab or a comma. The separator is specified with the option 'delimiter' of loadtxt.
In [126]: a=loadtxt('example_csv.csv', delimiter=',', converters={0: datestr2num})
The option 'converters' maps a column number to a function that converts the text in that column into a number. Here, the first column (column 0) contains dates, which are converted using datestr2num from matplotlib.dates. A full list of options of loadtxt is available in the NumPy documentation.

3.1.3. UNSTRUCTURED TEXT
If the text data is unstructured, extra work is needed to read the file and to process it before it can be saved as an array.
• First the file has to be opened as an object:
In [127]: a=file('filename')
✓ file() is available in Python 2 only.
In [128]: a=open('filename')
✓ open() works in both Python 2 and Python 3.
In [129]: type(a)
Out[129]: file
✓ In Python 3, the type is shown as _io.TextIOWrapper.
• Extracting data from the file as a 'list':
In [130]: a_list=a.readlines()
✓ readlines() reads the contents of the file object 'a' line by line and puts each line into a_list.
• Extracting data from the file as a 'string':
In [131]: a_str=a.read()
✓ read() reads the contents of the file object 'a' and stores them as a single string. A file object can be read only once; after readlines() or read() has reached the end of the file, it has to be reopened (or rewound with a.seek(0)) to be read again.
Each line of a text file ends with a newline character ('\n', or '\r\n' for files created on Windows). These characters need to be removed from each line/item obtained using read or readlines.
• strip() is used to drop the '\n' or '\r\n' at the end of each line. To drop it from each element of a_list:
In [132]: b=[s.strip() for s in a_list]
• Furthermore, to convert each element into float:
In [133]: b=[float(s.strip()) for s in a_list]

3.2. SAVE TEXT FILE
• To save an array 'a',
In [134]: savetxt(filename, a, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ')
A full list of options of savetxt is available in the NumPy documentation.
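The reading and writing steps of Sections 3.1 and 3.2 can be combined into a round trip. The following is a minimal sketch, assuming NumPy is imported as np; the data values and the file name example_csv.csv are made up for illustration, and the file is written to a temporary directory so that nothing is left behind.

```python
import os
import tempfile

import numpy as np

data = np.array([[1.0, 2.5], [3.0, 4.5]])   # made-up example values

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example_csv.csv')   # hypothetical file name

    # Save with a comma as delimiter; the header line is prefixed with '# '.
    np.savetxt(path, data, fmt='%.3f', delimiter=',', header='x,y')

    # Structured text: loadtxt skips the comment line and parses the numbers.
    loaded = np.loadtxt(path, delimiter=',', comments='#')

    # Unstructured approach: read the lines and process them by hand.
    with open(path) as f:
        lines = f.readlines()

# Drop the comment line, strip the newline, split on commas, convert to float.
rows = [[float(v) for v in line.strip().split(',')]
        for line in lines if not line.startswith('#')]

print(loaded)
print(rows)
```

Opening the file in a with block closes it automatically once the lines have been read.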
3.3. READ BINARY DATA
The binary data format is used because it needs fewer bytes to store each value, so that files are smaller and are read more efficiently. This section explains the procedure of reading and writing data in binary format using the NumPy function fromfile.
In [135]: dat=fromfile('filename','type code')
• filename is the name of the file.
• type code: can be given as a type code (e.g., 'f') as shown in Table 3.1, or as an explicit type name (e.g., 'float32'). It determines the size and byte-order of the items in the binary file, so it has to match the type with which the file was written. Note that the name 'float' denotes an 8-byte double precision number (type code 'd'), not the 4-byte 'f'.

Table 3.1: Data type of the returned array
Type code   C Type           Python Type         Minimum size in bytes
'c'         char             character           1
'b'         signed char      int                 1
'B'         unsigned char    int                 1
'u'         Py_UNICODE       Unicode character   2
'h'         signed short     int                 2
'H'         unsigned short   int                 2
'i'         signed int       int                 2
'I'         unsigned int     long                2
'l'         signed long      int                 4
'L'         unsigned long    long                4
'f'         float            float               4
'd'         double           float               8
The sizes in the table are minimum sizes; the actual size of the integer types depends on the platform.

In [136]: dat=fromfile('example_binary.float32','f')
          dat
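Because a raw binary file stores no information about its data type, the type code passed to fromfile has to be known in advance. The following is a minimal sketch of what happens when it is right and when it is wrong, assuming NumPy is imported as np. The file is first created with the array method tofile() so that the example is self-contained, and the file name is made up for illustration.

```python
import os
import tempfile

import numpy as np

values = np.arange(6, dtype='float32')      # six 4-byte floats: 0.0 ... 5.0

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example_binary.float32')   # hypothetical name
    values.tofile(path)                     # raw bytes only, no header or shape
    n_bytes = os.path.getsize(path)

    dat = np.fromfile(path, dtype='float32')     # correct type: values recovered
    wrong = np.fromfile(path, dtype='float64')   # wrong type: bytes misinterpreted

# The shape is not stored in the file either, so it is restored by hand.
grid = dat.reshape(2, 3)

print(n_bytes, dat, wrong.size)
print(grid)
```

Reading with the wrong type raises no error here; it silently returns half as many, meaningless values, which is why the type and shape of a binary file are usually recorded in its name or in accompanying documentation.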
3.4. WRITE BINARY DATA
• To write/save all items (as machine values) of an array "A" to a file:
In [137]: A.tofile('filename')
• The data type can also be set before writing as,
In [138]: A.astype('f').tofile('filename')

3.5. READ NETCDF DATA
NetCDF data files can be read by several packages such as Scientific, Scipy, and netCDF4. Below is an example of reading a netCDF file using the io module of Scipy, which supports the netCDF 3 format.
In [139]: from scipy.io import netcdf_file
          ncf=netcdf_file('example_netCDF.nc')
          ncf.variables
          dat=ncf.variables['wbal_clim_CUM'][:]

3.6. WRITE NETCDF DATA
A short example of how to create netCDF data is below. For details, refer to the original Scipy help page.
In [140]: import numpy as np
          from scipy.io import netcdf_file
          f=netcdf_file('simple.nc','w')
          f.history='Created for a test'
          f.createDimension('time',10)
          time=f.createVariable('time','i',('time',))
          time[:]=np.arange(10)
          time.units='days since 2008-01-01'
          f.close()

3.7. READ MATLAB DATA
MATLAB data files saved in version 7.3 format are HDF5 files and can be read by using the Python interface for HDF5 datasets. This requires installation of the h5py package. Files saved in older MATLAB formats can be read with scipy.io.loadmat instead.
In [141]: import h5py
          a=h5py.File('example_matlab.mat','r')
In [142]: list(a.keys())
Out[142]: ['#refs#','Results']
In [143]: list(a['Results'].keys())
Out[143]: ['SimpBM','SimpBM2L','SimpBMtH','SimpGWoneTfC','SimpGWvD']
✓ In Python 2, the names are displayed as unicode strings, e.g., u'Results'.
In [144]: dat=a['Results/SimpGWvD/Default/ModelOutput/actET'][:]
          dat=a['Results']['SimpGWvD']['Default']['ModelOutput']['actET'][:]
✓ Both lines extract the same data; a nested variable can be addressed either by its full path or by one key at a time.

3.8. READ EXCEL DATA
Excel workbooks created by MS Office 2010 or later (.xlsx files) can be read using the openpyxl package.
In [145]: from openpyxl import load_workbook
          ex_f=load_workbook('example_xls.xlsx')
          ex_f.sheetnames
          a_sheet=ex_f['Belleville_96-pr']
4 DATA OPERATIONS IN PYTHON
Information on common mathematical and simple statistical operations on data.
4.1. SIZE AND SHAPE
All data are characterized by two things: how big they are (size), and how they are arranged (shape). Here are some useful commands to play with the size and shape of data. We will use the following list as an example:
In [146]: a=[1,2,3,4,5,6,7,8,9,10]
• Check the size (total number of elements) of the data:
In [147]: size(a)
Out[147]: 10
• Check the shape of the data:
In [148]: shape(a)
Out[148]: (10,)
✓ for list.
In [149]: array(a).shape
Out[149]: (10,)
✓ for array.
✓ Note that, for 2-dimensional arrays in Python, the order is number of rows (longitudinal direction ↓) followed by number of columns (lateral direction →).
• Change the arrangement of the data:
In [150]: b=array(a).reshape(2,5)
✓ The reshape() method can be used for arrays only, so the list is first converted to an array.
In [151]: b=array([[1,2,3,4,5],[6,7,8,9,10]])
✓ The result is the same as defining b directly with 2 rows and 5 columns.
In [152]: b=reshape(a,(2,5))
✓ The reshape() function can be used for both arrays and lists. A list is converted to an array by this function.
In [153]: b=array(a).reshape(-1,5)
✓ By using '-1', the size of the first dimension is calculated automatically from the total size of the array. For e.g., if there are 10 elements in an array/list and 5 columns are specified during reshape, the number of rows is automatically calculated as 2. The shape will be (2,5).
• Convert the data type to array:
In [154]: b=array(a)
• Convert the data type to list:
In [155]: c=b.tolist()
✓ tolist() is a method of arrays, so it is applied to the array b and not to the list a.
• Convert the data into float and integer:
In [156]: float(a[0])
          int(a[0])
✓ These functions can only be used for one element at a time. To convert a whole array, use b.astype(float) or b.astype(int).

4.2. SLICING AND DICING
This section explains how to extract data from an array or a list. The following process can be used to take data for a region from global data, or for a limited period from long time series data. The process is called 'slicing'. The same method can be used for arrays and lists. Let's consider the following list,
In [157]: a=[1,2,3,4,5]
✓ There are five items in the list.

INDEX BASICS
Indexing is done in two ways:
1. Positive Index: The counting order is from left to right. The index for the first element is 0 (not 1).
In [158]: a[0]
Out[158]: 1
In [159]: a[1]
Out[159]: 2
In [160]: a[4]
Out[160]: 5
✓ The fifth item (index=4) is 5.
2. Negative Index: The counting order is from right to left. The index for the last item is -1. In some cases, the list is very long and it is much easier to count from the end rather than from the beginning.
In [161]: a[-1]
Out[161]: 5
✓ It is the same as a[4] as shown above.
In [162]: a[-2]
Out[162]: 4

DATA EXTRACTION
Data extraction is carried out by using indices. In this section, some examples of using indices are provided. Details of array indexing and slicing can be found in the NumPy documentation.
1. Using two indices: The general form is as follows, where the interval is optional.
In [163]: somelist[first index:last index:(interval)]
In [164]: a[0:2]
Out[164]: [1,2]
✓ a[0] and a[1] are included but a[2] is not included.
In [165]: a[3:4]
Out[165]: [4]
✓ A slice always returns a list, even if it contains only one item.
2. Using single index:
In [166]: a[:2]
Out[166]: [1,2]
✓ same as a[0:2].
In [167]: a[2:]
Out[167]: [3,4,5]
✓ same as a[2:5].
3. Consider a 2-D list and a 2-D array. The method of extraction differs between the two because they are indexed differently, as explained below.
In [168]: a_list=[[1,2,3],[4,5,6]]
          a_array=array([[1,2,3],[4,5,6]])
In [169]: shape(a_list)
Out[169]: (2,3)
In [170]: a_array.shape
Out[170]: (2,3)
In [171]: a_list[0]
Out[171]: [1,2,3]
✓ which is a list.
In [172]: a_array[0]
Out[172]: array([1,2,3])
✓ which is an array.
4. To extract data from list,
In [173]: a_list[0][1]
Out[173]: 2
In [174]: a_list[1][:2]
Out[174]: [4,5]
✓ The indices have to be provided in two different sets of square brackets “[ ]”.
5. To extract data from array,
In [175]: a_array[0,1]
Out[175]: 2
In [176]: a_array[1,:2]
Out[176]: array([4,5])
✓ The indices can be provided in one set of square brackets “[ ]”.
6. Consider a 3-D list and a 3-D array,
In [177]: a_list=[[[2,3],[4,5],[6,7],[8,9]],[[12,13],[14,15],[16,17],[18,19]]]
          a_array=array([[[2,3],[4,5],[6,7],[8,9]],[[12,13],[14,15],[16,17],[18,19]]])
✓ The shape of both data is (2,4,2).
To extract from list,
In [178]: a_list[0][2]
Out[178]: [6,7]
In [179]: a_list[0][2][1]
Out[179]: 7
To extract from array,
In [180]: a_array[0,2]
Out[180]: array([6,7])
In [181]: a_array[0,2,1]
Out[181]: 7

4.3. BUILT-IN MATHEMATICAL FUNCTIONS
The Python interpreter has a number of functions built into it, and NumPy provides further functions and array methods. This section documents the commonly used ones in easy-to-use order. max(), min(), sum(), abs(), divmod(), and pow() are Python built-ins, whereas mean() and median() are NumPy functions. Firstly, consider the following 2-D arrays,
In [182]: A=array([[-2, 2], [-5, 5]])
          B=array([[2, 2], [5, 5]])
          C=array([[2.53, 2.5556], [5.3678, 5.4568]])
1. max(iterable): If a single iterable is passed, returns its largest element. With two or more arguments, returns the largest of the arguments. For an array, the method A.max() returns the largest element of the whole array.
In [183]: max([0,10,15,30,100,-5])
Out[183]: 100
In [184]: A.max()
Out[184]: 5
2. min(iterable): If a single iterable is passed, returns its smallest element. With two or more arguments, returns the smallest of the arguments. For an array, the method A.min() returns the smallest element of the whole array.
In [185]: min([0,10,15,30,100,-5])
Out[185]: -5
In [186]: A.min()
Out[186]: -5
3. mean(iterable): Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. For details, refer to the NumPy documentation.
In [187]: mean([0,10,15,30,100,-5])
Out[187]: 25.0
In [188]: A.mean()
Out[188]: 0.0
4. median(iterable): Returns the median of the array elements. Unlike mean, median is available only as a function and not as an array method.
In [189]: median([0,10,15,30,100,-5])
Out[189]: 12.5
In [190]: median(A)
Out[190]: 0.0
5. sum(iterable): Returns the sum of the elements. The array method A.sum() returns the sum over an axis if the axis is specified, otherwise the sum of all elements. For details, refer to the NumPy documentation.
In [191]: sum([1,2,3,4])
Out[191]: 10
In [192]: A.sum()
Out[192]: 0
6. abs(A): Returns the absolute value of a number, which can be an integer or a float, or of each element of an entire array.
In [193]: abs(A)
Out[193]: array([[2,2],[5,5]])
In [194]: abs(B)
Out[194]: array([[2, 2],[5, 5]])
7. divmod(x,y): Returns the quotient and remainder resulting from dividing the first argument (some number x or an array) by the second (some number y or an array).
In [195]: divmod(2, 3)
Out[195]: (0, 2)
✓ as the integer quotient of 2 / 3 is 0 and the remainder is 2.
In [196]: divmod(4, 2)
Out[196]: (2, 0)
✓ as the integer quotient of 4 / 2 is 2 and the remainder is 0.
In case of two-dimensional array data, the division is carried out elementwise:
In [197]: divmod(A,B)
Out[197]: (array([[-1, 1], [-1, 1]]), array([[0, 0], [0, 0]]))
8. modulo (x%y): Returns the remainder of a division of x by y.
In [198]: 5%2
Out[198]: 1
9. pow(x,y[, z]): Returns x to the power y. If z is present, returns x to the power y, modulo z (more efficient than pow(x, y) % z). pow(x, y) is equivalent to x**y.
In [199]: pow(A, B)
Out[199]: array([[4, 4], [-3125, 3125]])
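The functions listed above can be checked together in a short script. The following is a minimal sketch, assuming NumPy is imported as np and using the arrays A and B defined at the beginning of this section. The axis argument and the three-argument form of pow() are included to illustrate the options mentioned in the descriptions.

```python
import numpy as np

A = np.array([[-2, 2], [-5, 5]])
B = np.array([[2, 2], [5, 5]])
values = [0, 10, 15, 30, 100, -5]

mean_value = np.mean(values)      # sum of the values (150) divided by 6
median_value = np.median(values)  # average of the two middle values, 10 and 15

col_sum = A.sum(axis=0)           # axis=0: sum down each column
row_max = A.max(axis=1)           # axis=1: maximum along each row

quotient, remainder = divmod(A, B)   # elementwise for arrays
power = pow(A, B)                    # same as A**B
mod_pow = pow(2, 10, 1000)           # (2**10) % 1000, for integers only

print(mean_value, median_value)
print(col_sum, row_max)
print(quotient, remainder, power, mod_pow)
```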