* Understand how Classes (new concept to acquire) work
* Create our first Class
* Working with objects
* Learn to install libraries
...
@@ -36,15 +36,15 @@
## Introduction
All the Python programs that you have developed so far use the **classical paradigm** of [procedural programming](https://en.wikipedia.org/wiki/Procedural_programming). The main idea is to use **functions** that can be called at any point during the program's execution.
The problem to be solved is divided into smaller parts, implemented using **functions**, **data structures** and **variables**.
In contrast, the [object oriented paradigm](https://en.wikipedia.org/wiki/Object-oriented_programming) models problems by defining **objects** that interact with each other. Each object has **attributes** and **methods**. The main advantages are the **encapsulation** and the **reusability** of the code.
## Working with sequences and functions
In the previous session we developed our own **library of functions** (Seq0) for working with sequences. As an example, let's use a modified version of that library (some input parameters are different) to perform some simple calculations on the sequence "ATTCCCGGGG". Save this example in the folder **S06** and call it **test-01.py**.
```python3
from Seq0 import *
...
@@ -84,7 +84,7 @@ Seq: ATTCCCGGGG
G: 4
```
The **paradigm** we are following is based on defining the data (variables) on one hand, and creating **functions** for working with that data on the other. When calling a function, you pass the data as **parameters**. Data and functions are thus **separated**.
Imagine now that we define a new sequence, but we make a mistake:
...
@@ -118,27 +118,27 @@ seq_check(seq1)
...
```
This solution is OK, but what if some programmers do not call this checking function? It is **NOT possible** for you to **make sure** that it is going to work in all cases. It depends on the people making use of it. Some may call the seq_check() function, but it is not guaranteed.
But is it possible to have a **better model** for organizing both the data and the functions?
## Modelling the sequences with object oriented programming
Yes! There are better models. One is **Object Oriented Programming**.
In this model, **data** and **functions** are grouped together into so-called **objects**. They are no longer separated. We work with **objects**, and every object has **well-defined actions** that it may perform. These actions are called **methods**.
In order to learn about this new paradigm, let's model DNA sequences with it.
We will think about the **sequences** as **objects**. These objects can have some **properties**, like their name, the chromosome to which they belong, or any other relevant information. We refer to these properties as the object **attributes**.
These objects also have **methods**: different actions that can be performed on them, such as calculating their length, the number of certain bases, their complement, and so on.
We will learn this model by defining **sequence objects** from **scratch**.
### Classes
A [class](https://en.wikipedia.org/wiki/Class\_(computer_programming)) is the **template** we use for **creating objects**, so an object is **an instance** of that class. Within the class we define and implement all the **methods** that the objects of that class will have. Let's create a very simple class for working with sequences. We start by defining an **empty class**:
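The class definition itself is not reproduced in this section. A minimal sketch, assuming the class is named Seq (as in the rest of the session) and that the main program creates the two objects s1 and s2 mentioned in the text, could look like this:

```python
class Seq:
    """A class for representing sequences (empty for now)"""
    pass


# --- Main program: create two (empty) objects from the Seq class
s1 = Seq()
s2 = Seq()

# Both objects are instances of the same class, but they are different objects
print(type(s1).__name__)  # Seq
print(s1 is s2)           # False
```

The `pass` statement is only a placeholder: Python does not allow an empty class body, so it stands in until the first method is added.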
Press the **step over** option twice. In the variable panel we will see the **two new objects** created: **s1** and **s2**. Notice that both are of **type Seq**:
...
@@ -169,7 +169,7 @@ Congrats! You have created your first two **empty objects**!
### The \__init_\_ method
The **methods** are the **actions** that objects can perform. The first method we are implementing is the **initialization method**. It is a **special method** that is called every time a new object is created. All methods have the **special parameter self** as their first parameter.
```python
class Seq:
...
@@ -185,7 +185,7 @@ s2 = Seq()
print("Testing...")
```
Run the program. When the s1 object is created, the string "New sequence created!" is printed. The same happens when the s2 object is created. The **output** of the program is:
```
New sequence created!
...
@@ -198,9 +198,9 @@ Testing....
### Adding data: attribute strbases
For representing a **sequence** we will use a **string** that will be stored in every object. We will call this string **strbases**. As mentioned earlier, data stored in the objects are called **attributes**, as they describe the objects.
Let's modify the \__init_\_ method to include a new parameter: the string that represents the sequence of the object. That parameter will be stored in the object attribute: **self.strbases**.
```python
class Seq:
...
@@ -219,15 +219,15 @@ s1 = Seq("AGTACACTGGT")
s2 = Seq("CGTAAC")
```
If you **execute** it, you will see the **same output** as before, but now something has changed. The two objects created have their own **sequence string** stored in the **strbases** attribute. Let's debug it to see it.
Each object has **its own sequence**! If you **debug** it using the **step into** option, you will see that the sequence is stored in the object when the **self.strbases = strbases** line is executed.
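Since the full listing is elided above, here is a minimal sketch of how the attribute is stored. It only assumes what the text states: an \__init_\_ method with a strbases parameter, the "New sequence created!" message, and the two sequences used in the main program.

```python
class Seq:
    """A class for representing sequences"""

    def __init__(self, strbases):
        # Store the string passed as parameter in the object's own attribute
        self.strbases = strbases
        print("New sequence created!")


# --- Main program
s1 = Seq("AGTACACTGGT")
s2 = Seq("CGTAAC")

# Each object keeps its own copy of the attribute
print(s1.strbases)  # AGTACACTGGT
print(s2.strbases)  # CGTAAC
```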
### The \__str_\_ method
There is another special method, called **\__str_\_**, that is invoked whenever the object is **printed**. In this example, printing our objects means that we want to see the **sequence** on the **console**.
```python3
class Seq:
...
@@ -268,11 +268,11 @@ Sequence 2: CGTAAC
Testing....
```
In **debug mode**, if the **step into** option is pressed when the next instruction to be executed is the first print, you will see how the execution pointer moves into the **\__str_\_ method**.
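As an illustration of the mechanism just described, the following sketch assumes that \__str_\_ simply returns the stored string (the actual implementation is elided above):

```python
class Seq:
    """A class for representing sequences"""

    def __init__(self, strbases):
        self.strbases = strbases

    def __str__(self):
        # Called automatically by print(), str() and f-strings
        return self.strbases


s1 = Seq("AGTACACTGGT")
print(f"Sequence 1: {s1}")  # Sequence 1: AGTACACTGGT
```

Note that \__str_\_ must return a string rather than print it: Python takes care of sending the returned value to the console.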
### Adding methods: len()
Let's add a new method: **len()** for calculating the **length of the sequence**. As it is a method, the first parameter must be **self**. For calculating the length, we will use the built-in **len()** function (because in this example the type we are using for storing the bases is a string).
```python3
class Seq:
...
@@ -308,7 +308,7 @@ print(f"Sequence 2: {s2}")
print(f" Length: {s2.len()}")
```
Once the objects have been created, their lengths can be calculated just by calling the **len() method**. To do so, we write a **dot** and then the **name** of the method: **s1.len()**. This means: "Execute the action for calculating the length on the s1 object".
**Run** the program. This is what you should see on the **console**:
...
@@ -323,7 +323,7 @@ Sequence 2: CGTAAC
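The len() method described in this section can be sketched as follows. This is an assumed, simplified version of the class (the original listing is elided), reusing the two sequences from the earlier examples:

```python
class Seq:
    """A class for representing sequences"""

    def __init__(self, strbases):
        self.strbases = strbases

    def __str__(self):
        return self.strbases

    def len(self):
        """Calculate the length of the sequence"""
        # The bases are stored in a string, so the built-in len() is enough
        return len(self.strbases)


s1 = Seq("AGTACACTGGT")
s2 = Seq("CGTAAC")
print(f"Sequence 1: {s1}")
print(f"  Length: {s1.len()}")
print(f"Sequence 2: {s2}")
print(f"  Length: {s2.len()}")
```

Inside the method, the name len still refers to the built-in function: the method is only reachable through an object, as in s1.len().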
### Inheritance
New classes can be **derived** from others, reusing their methods (inherited from the parent) and adding new ones. This is called **inheritance**. Just to present the concept, let's create the **Gene class**, which derives (inherits) from Seq. It will not add anything.
```python3
class Gene(Seq):
...
@@ -334,7 +334,7 @@ class Gene(Seq):
    pass
```
Now we can create objects from the **Gene class**. These objects will have the same methods as the **Seq objects**, since the Gene class does nothing but inherit from the Seq class. We say that these methods have been inherited.
```python3
# --- Main program
...
@@ -361,7 +361,7 @@ Gene: CGTAAC
#### Expanding the Gene Class
The **Gene** is a kind of **specialized sequence**. Not all sequences are genes, but all genes are sequences. We can **expand** the **Gene class** by adding more information that is not present in general sequences, for example the **name** (genes usually have a special name).
Let's create a new \__init_\_ method that includes a new optional parameter: the name of the Gene:
...
@@ -380,7 +380,7 @@ class Gene(Seq):
print("New gene created")
```
As the Gene is also a Seq, for creating a Gene we should first call the **\__init_\_** method from the Seq class (the _super_ class). We do it by calling **super().\__init_\_(strbases)**, and then we add the properties of the Gene. In this case, only its name.
When creating the Gene in the main program, we specify its **name** as a parameter, for example FRAT1:
...
@@ -404,11 +404,11 @@ Sequence 1: AGTACACTGGT
Gene: CGTAAC
```
Notice how the **\__init_\_** method from Seq has been **called twice**, and the \__init_\_ method from Gene only **once**.
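The following sketch illustrates that call sequence. It is a simplified reconstruction, not the original listing: the default value of the name parameter (an empty string) and the variable names s1 and g are assumptions.

```python
class Seq:
    """Generic sequence (simplified version of the class in this session)"""

    def __init__(self, strbases):
        self.strbases = strbases
        print("New sequence created!")


class Gene(Seq):
    """A Gene is a Seq with an additional attribute: its name"""

    def __init__(self, strbases, name=""):
        # First initialize the Seq part of the object
        super().__init__(strbases)
        # Then add the Gene-specific attribute
        self.name = name
        print("New gene created")


# --- Main program
s1 = Seq("AGTACACTGGT")        # Calls Seq.__init__ only
g = Gene("CGTAAC", "FRAT1")    # Calls Gene.__init__, which calls Seq.__init__
```

With this main program, "New sequence created!" is printed twice (once for s1 and once through super() when the gene is created), and "New gene created" only once.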
#### Overriding the Seq methods
In the Gene class, we can either create **new methods** or **re-implement** methods that were already implemented in the _parent_ class (Seq in this case). This operation of re-implementation is called **overriding**. For example, let's change the \__str_\_ method for **printing** the **Gene name** along with the sequence.
```python3
class Gene(Seq):
...
@@ -451,21 +451,21 @@ Sequence 1: AGTACACTGGT
Gene: FRAT1-CGTAAC
```
If you have a look at the right side of the \__str_\_ method in the **Gene class**, you will notice a new icon: a **circle with an arrow** pointing upwards. This means that this method is **overriding** another one implemented in the super class (Seq).
On the other hand, if you have a look at the \__str_\_ method in the **Seq class**, you will see the previous icon and a **new one** with another circle and an **arrow pointing downwards**.
This means that there is a **sub-class** that is overriding this method, and that the Seq \__str_\_ method is also overriding another one from its own **super class** (generic).
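Both levels of overriding can be seen in a short sketch. The classes below are simplified assumptions; the "name-bases" format is taken from the output shown above, and the generic super class is Python's built-in object class.

```python
class Seq:
    def __init__(self, strbases):
        self.strbases = strbases

    def __str__(self):
        # Overrides the generic __str__ inherited from object
        return self.strbases


class Gene(Seq):
    def __init__(self, strbases, name=""):
        super().__init__(strbases)
        self.name = name

    def __str__(self):
        # Overrides Seq.__str__: the name is printed along with the bases
        return f"{self.name}-{self.strbases}"


s1 = Seq("AGTACACTGGT")
g = Gene("CGTAAC", "FRAT1")
print(f"Sequence 1: {s1}")  # Sequence 1: AGTACACTGGT
print(f"Gene: {g}")         # Gene: FRAT1-CGTAAC
```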
## Installing libraries: termcolor
There are many **Python libraries** available for you to use. You only have to **install** them. As an example of how to do it, let's install the [termcolor](https://pypi.org/project/termcolor/) library. It will allow us to **print messages in color** in the **console**.
Create a **new python file** in the **S06** folder and call it **test-color.py**. Write this code:
```python
import termcolor
...
@@ -475,29 +475,29 @@ termcolor.cprint("Hey! this is printed in green!", 'green')
You will see that the word **termcolor** is underlined in **red**. This is because PyCharm does not know anything about the termcolor module.
Click on the **install package termcolor** option. The installation of the package starts. After a few seconds it will be ready, and the word termcolor will no longer be in red.
Now **run** the program. You will see a green message:
You now have the power of printing in colors... use it wisely :)
## Exercises
Let's practice with classes and objects! If you have not done it yet, create the **S06** folder.
**NOTE**: These exercises are independent of each other. To modify the Seq class, just copy & paste the older version into a new file.
### Exercise 1
The current **Seq class** created as an example in this session does not check whether the given string of bases is **valid**. Therefore, if we execute the following code in the main program:
```python3
s1 = Seq("ACCTGC")
...
@@ -513,10 +513,10 @@ New sequence created!
New sequence created!
```
The goal of this exercise is to detect **incorrect sequences**.
* **Filename**: S06/e1.py
* **Description**: Modify the \__init_\_ method of the Seq class so that it detects whether the given string only contains valid bases ('A', 'C', 'G' and 'T'). If a different character is found, the sequence should be initialized with the **"ERROR"** string, and the message **"INCORRECT Sequence detected"** should be printed in the console.
When the previous **main program** is executed, this is what should be printed on the console:
...
@@ -537,7 +537,7 @@ We can create **lists of sequences** very easily. For example, in this list ther
But then we need to develop the function **print_seqs(seq_list)** that receives a **list of sequences** and prints their **index** in the list, their **length** and the **sequence** itself. For example, if we call the function with the previous list, the output in the console should be:
* **Description**: Implement the print_seqs() function (of course, this is not a class method but a separate function).
### Exercise 3
We need to develop some **functions** to create sequences for **testing** the Seq objects. For instance, the function **generate_seqs(pattern, number)**, which has two parameters, will create a **list** with the provided _number_ of sequences. All the sequences are created from the given **pattern**. This pattern is a string of one or more valid bases. The first sequence of the list will contain the pattern once, the second will contain the pattern repeated twice, the third will contain it three times, and so on.
Therefore, if we call the function generate_seqs() with the parameters ("A", 3), a list of 3 sequences is returned. The bases in the sequences will be: "A", "AA" and "AAA".
* **Filename**: S06/e.py
* **Description**: Implement the generate_seqs() function. Test the function with this main program:
Let's play with the **termcolor** module. Modify exercise 3 to print both lists in different colors: print list 1 in **blue**, and list 2 in **green**.
You should modify the **print_seqs()** function to include an additional parameter: the color.
* **Filename**: S06/e4.py
* **Description**: The same output as e3.py, but in colors. This is how it should look: