Python's Normal Classes and Data Classes: What's the Difference and Which Should You Choose?
Hey, it's been a while since I put out a technical piece on anything. I was jaded by things happening around me, but now I'm back and I've got loads of tech topics and concepts I want to share. For starters, while researching Python's data classes, I decided to write a technical piece about them. I hope you find it useful.
Python is an object-oriented programming language that provides different ways to define and use classes. Two popular options are data classes and normal classes. In this article, we'll explore the differences between these two types of classes and help you decide which one to use in your Python code.
What are Normal Classes?
In Python, a class is a blueprint for creating objects. A normal class is a user-defined class written without the @dataclass decorator, so none of its methods are generated for us. It typically has a constructor method, __init__, that initializes the class instance with default or user-provided values. A normal class can also have additional methods defined within its scope.
Here is an example of a normal class in Python:
class Dog:
    def __init__(self, name, breed, age):
        self.name = name
        self.breed = breed
        self.age = age
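As a brief illustration of the "additional methods" mentioned above, here is a minimal sketch that extends the same Dog class. The describe() method and the sample values are hypothetical and only there to show how an instance is created and used.

```python
class Dog:
    def __init__(self, name, breed, age):
        self.name = name
        self.breed = breed
        self.age = age

    # Hypothetical extra method, added only for illustration.
    def describe(self):
        return f"{self.name} is a {self.age}-year-old {self.breed}"


rex = Dog("Rex", "Beagle", 3)
print(rex.describe())  # Rex is a 3-year-old Beagle

rex.age = 4            # instance attributes are plain and writable
print(rex.age)         # 4
```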
What are Data Classes?
A data class is a class designed primarily to store data. In Python, data classes are created using the @dataclass decorator from the dataclasses module (available since Python 3.7). They are best suited to simple data structures with little behavior, although they are ordinary classes underneath and can still define methods.
Here is an example of a data class in Python:
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
In the code above, we defined the Person data class using the @dataclass decorator. We also defined two fields, name and age, which are annotated with their respective types.
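To make the effect of the decorator more concrete, the following sketch creates an instance of the same Person data class and inspects it. The sample values are made up; fields() is a helper from the dataclasses module that lists the fields the decorator discovered.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


# The __init__ used here was generated from the annotated fields.
alice = Person("Alice", 30)
print(alice)  # Person(name='Alice', age=30)

# fields() returns one Field object per annotated attribute, in order.
print([f.name for f in fields(Person)])  # ['name', 'age']
```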
Differences between Normal and Data Classes
1- Conciseness: Data classes are more concise than normal classes as they require less code to define. An __init__ method is generated automatically from the annotated fields, and a __repr__ method is provided for us.
Here is an example that compares the conciseness of a normal class and a data class:
from dataclasses import dataclass

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"

@dataclass
class Point2:
    x: int
    y: int
In the above example, the Point class has a constructor method and a __repr__ method, while the Point2 data class only has the two fields x and y. The __repr__ method is automatically generated for Point2.
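The sketch below shows what the generated __repr__ buys us. PlainPoint is a hypothetical third class, not part of the example above, included to show the default representation a normal class gets when it defines no __repr__ of its own.

```python
from dataclasses import dataclass


class PlainPoint:
    # Hypothetical normal class with no __repr__ defined.
    def __init__(self, x, y):
        self.x = x
        self.y = y


@dataclass
class Point2:
    x: int
    y: int


# Falls back to object's default repr: the class name and a memory address.
print(PlainPoint(1, 2))

# The data class prints its field names and values.
print(Point2(1, 2))  # Point2(x=1, y=2)
```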
2- Boilerplate code: One of the biggest differences between normal classes and data classes is the amount of boilerplate code needed to define them.
When defining a normal class, we need to write the __init__ method ourselves. Python attributes are public, so getters and setters are not required to access or modify them from outside the class, but code written in a Java-like style often adds them for each attribute anyway. This can quickly become cumbersome when dealing with classes with many attributes.
Here's an example of a normal class that defines two attributes, name and age:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def get_name(self):
        return self.name

    def set_name(self, name):
        self.name = name

    def get_age(self):
        return self.age

    def set_age(self, age):
        self.age = age
On the other hand, when defining a data class, we only need to declare the attributes as annotated fields in the class body, and the boilerplate code is generated for us automatically. This makes data classes much more concise and easier to read.
Here's an example of a data class that defines the same two attributes, name and age:
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
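In everyday Python, attributes are usually read and assigned directly, and a property is added only when an attribute needs extra logic such as validation. The sketch below shows both cases. The Account class and its balance rule are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


p = Person("Alice", 30)
print(p.name)  # Alice
p.age = 31     # direct assignment, no setter method required
print(p)       # Person(name='Alice', age=31)


class Account:
    """Hypothetical normal class that validates on assignment."""

    def __init__(self, balance):
        self.balance = balance  # routed through the property setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value


acct = Account(10)
acct.balance = 25    # still looks like plain attribute access
print(acct.balance)  # 25
```

The property keeps the calling code identical to plain attribute access, which is one reason explicit get_ and set_ methods are uncommon in Python.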
3- Immutability: Another key difference between normal classes and data classes is how easily they can be made immutable. By default, data classes are mutable, just like normal classes. Passing the frozen=True argument to the dataclass decorator makes instances immutable, meaning that their attributes cannot be reassigned after they are created:
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    name: str
    age: int
In contrast, normal classes have no built-in switch for immutability: their attributes can be modified at any time unless we write the protection ourselves. While mutability can be useful in certain cases, it can also make it difficult to reason about the state of an object, particularly in multi-threaded environments.
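To see the effect of frozen=True in practice, the sketch below compares a data class declared without it to one declared with it. The class names MutablePerson and FrozenPerson are made up for the illustration; FrozenInstanceError and replace() both come from the dataclasses module.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass
class MutablePerson:
    name: str
    age: int


@dataclass(frozen=True)
class FrozenPerson:
    name: str
    age: int


m = MutablePerson("Alice", 30)
m.age = 31  # allowed: a plain @dataclass is mutable

f = FrozenPerson("Alice", 30)
blocked = False
try:
    f.age = 31  # not allowed on a frozen data class
except FrozenInstanceError:
    blocked = True
print(blocked)  # True

# To "change" a frozen instance, build a modified copy instead.
older = replace(f, age=31)
print(older)  # FrozenPerson(name='Alice', age=31)
```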
4- Type Hinting: Type hinting is another area where data classes and normal classes differ. With data classes, type annotations are part of the class definition itself, because the @dataclass decorator uses them to discover the fields. Here's an example:
from dataclasses import dataclass
from typing import List

@dataclass
class Person:
    name: str
    age: int
    hobbies: List[str]
As you can see, we specified the types of the name, age, and hobbies attributes. This allows a static type checker such as mypy to catch type errors during development, which can help prevent bugs and improve the reliability of our code. Note that Python itself does not enforce these annotations at runtime.
In contrast, with normal classes, annotations are optional, and we typically add them by hand to the constructor's parameters. Here's an example:
from typing import List

class Person:
    def __init__(self, name: str, age: int, hobbies: List[str]):
        self.name = name
        self.age = age
        self.hobbies = hobbies
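One detail worth keeping in mind is that type hints are not checked when the program runs, for data classes and normal classes alike. The sketch below passes a deliberately wrong type to show this. The sample values are made up, and a separate static checker (mypy is one example) would be the tool that reports the mismatch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Person:
    name: str
    age: int
    hobbies: List[str]


# This runs without error even though age is annotated as int:
# the interpreter stores whatever value it is given.
p = Person("Alice", "thirty", ["chess"])
print(type(p.age).__name__)  # str

# The annotations are still available for tools to inspect.
print(list(Person.__annotations__))  # ['name', 'age', 'hobbies']
```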
5- Equality: By default, data classes define equality based on the values of their fields, meaning that two instances of the same data class are considered equal if all their fields have the same values.
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

p1 = Person("Alice", 30)
p2 = Person("Alice", 30)
print(p1 == p2)  # True
In contrast, normal classes define equality based on object identity, meaning that two instances of a normal class are considered equal only if they are the same object in memory.
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p1 = Person("Alice", 30)
p2 = Person("Alice", 30)
print(p1 == p2)  # False
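6- Overriding default methods: Data classes provide default implementations for several methods such as __init__(), __repr__(), and __eq__(). Python derives != from __eq__(), so no separate __ne__() is needed. A __hash__() method is generated when the class is declared with frozen=True; a plain @dataclass sets __hash__ to None, which makes its instances unhashable. These default implementations are generated based on the fields defined in the class. This allows us to avoid writing boilerplate code and focus on the logic of our program.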
On the other hand, with normal classes, we have to define these methods ourselves if we want to use them. Here's an example of a normal class that implements these methods:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person(name='{self.name}', age={self.age})"

    def __eq__(self, other):
        if isinstance(other, Person):
            return self.name == other.name and self.age == other.age
        return False

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash((self.name, self.age))
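For comparison, here is a sketch of roughly equivalent data classes. FrozenPerson is a hypothetical name used to show the hashable variant; the sample values are made up. It also illustrates which of the methods above the decorator supplies in each configuration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


@dataclass(frozen=True)
class FrozenPerson:
    name: str
    age: int


# __eq__ is generated, and != is derived from it by Python.
print(Person("Alice", 30) == Person("Alice", 30))  # True
print(Person("Alice", 30) != Person("Bob", 25))    # True

# A plain (mutable) data class is unhashable: __hash__ is set to None.
print(Person.__hash__ is None)  # True

# A frozen data class gets a field-based __hash__, so it works in sets.
people = {FrozenPerson("Alice", 30), FrozenPerson("Alice", 30)}
print(len(people))  # 1
```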
Conclusion
Data classes and normal classes are both used in Python to represent objects. Data classes are a more concise and readable way of defining classes that are primarily used to store data, whereas normal classes are more flexible and customizable.
Data classes are preferred when we have simple data structures and we want to avoid writing boilerplate code. They also provide several default methods such as __init__(), __repr__(), and __eq__() (plus __hash__() for frozen data classes) that we don't have to write ourselves.
Normal classes, on the other hand, are more flexible and customizable. They allow us to define our own methods and attributes and provide us with complete control over our classes.
In summary, data classes are great for simple data structures and normal classes are more appropriate when we need more flexibility and control over our classes.
PS: Did you find this useful? If so, follow me for article updates and tweets.