What Are Data Structures?
Every program you've ever used — from a simple calculator to a social media feed — relies on data structures under the hood. A data structure is simply a way of organizing and storing data in a computer so that it can be accessed and modified efficiently.
Think of it like a toolbox. You could throw all your tools into one pile, but finding the right one would take forever. Instead, you organize them: screwdrivers in one drawer, wrenches in another, and so on. Data structures bring that same concept to your code.
Choosing the right data structure can mean the difference between a program that responds in milliseconds and one that takes minutes to finish. That's why understanding them is considered a foundational skill for every software engineer.
Arrays — The Simplest Structure
An array is an ordered collection of elements stored at contiguous memory locations. It is the most basic data structure and serves as the foundation for many others.
Arrays give you instant access to any element using its index (position), making reads extremely fast. The trade-off is that inserting or deleting elements in the middle is slow because every element after the insertion point must be shifted.
Array basics in Python
python
# Creating an array (list in Python)
fruits = ["apple", "banana", "cherry"]
# Access by index — O(1)
print(fruits[0]) # apple
# Append to end — O(1)
fruits.append("date")
# Insert in middle — O(n), shifts everything after
fruits.insert(1, "avocado")
# Delete by value — O(n)
fruits.remove("banana")
print(fruits) # ['apple', 'avocado', 'cherry', 'date']When to Use an Array
Use arrays when:
— You need fast access to elements by position.
— The size of the data is known or doesn't change often.
— You're iterating over all elements sequentially.
Avoid arrays when you need frequent insertions or deletions in the middle of the collection — a Linked List or a different structure will serve you better.
Linked Lists — Flexible Chains of Data
A linked list is a collection of nodes where each node holds a value and a pointer (reference) to the next node. Unlike arrays, linked list elements don't have to live in contiguous memory.
This makes insertions and deletions at any point very efficient — you simply rewire the pointers. The cost is that you lose instant index-based access; to find the 10th element you have to walk the chain from the beginning.
Linked list node in Python
python
class Node:
def __init__(self, value):
self.value = value
self.next = None # pointer to the next node
class LinkedList:
def __init__(self):
self.head = None
def append(self, value):
new_node = Node(value)
if not self.head:
self.head = new_node
return
current = self.head
while current.next:
current = current.next
current.next = new_node
def display(self):
elements = []
current = self.head
while current:
elements.append(current.value)
current = current.next
print(" -> ".join(str(e) for e in elements))
ll = LinkedList()
ll.append(10)
ll.append(20)
ll.append(30)
ll.display() # 10 -> 20 -> 30Singly vs Doubly Linked Lists
A singly linked list has nodes that point only forward (to the next node). A doubly linked list adds a second pointer that also points backward (to the previous node).
Doubly linked lists allow traversal in both directions and make deletion of a known node faster, but they consume more memory due to the extra pointer per node.
Stacks and Queues — Order Matters
Stacks and Queues are abstract data types built on top of arrays or linked lists. Their power comes from strictly controlling how data enters and exits.
Stack — Last In, First Out (LIFO)
A stack works like a pile of plates: you always add and remove from the top. The last item pushed on is the first one popped off.
Real-world uses: function call stacks, undo/redo history, expression parsing.
Stack in Python
python
stack = []
# Push
stack.append("page1")
stack.append("page2")
stack.append("page3")
# Pop — removes the last added item
back = stack.pop()
print(back) # page3
print(stack) # ['page1', 'page2']Queue — First In, First Out (FIFO)
A queue works like a line at a coffee shop: the first person who joined is the first to be served. Items are added at the back and removed from the front.
Real-world uses: task schedulers, print queues, breadth-first graph search.
Queue in Python
python
from collections import deque
queue = deque()
# Enqueue
queue.append("task_1")
queue.append("task_2")
queue.append("task_3")
# Dequeue — removes the first added item
next_task = queue.popleft()
print(next_task) # task_1
print(queue) # deque(['task_2', 'task_3'])Hash Tables — Lightning-Fast Lookups
A hash table (called a dictionary in Python or an object in JavaScript) maps keys to values using a hash function. The hash function converts a key into an index, telling the computer exactly where to store or retrieve the value in memory.
This makes lookups, insertions, and deletions all O(1) on average — essentially instant, regardless of how much data you have. That's why hash tables are one of the most heavily used data structures in real software.
Hash table (dict) in Python
python
# Creating a hash table
phonebook = {
"Alice": "555-1234",
"Bob": "555-5678",
"Carol": "555-9999",
}
# Lookup — O(1) average
print(phonebook["Alice"]) # 555-1234
# Insert
phonebook["Dave"] = "555-0000"
# Delete
del phonebook["Bob"]
# Check existence
if "Carol" in phonebook:
print("Found Carol!")Handling Collisions
A collision happens when two different keys hash to the same index. Two common strategies to handle this are:
1. Chaining — each slot in the table holds a list; colliding entries are added to that list.
2. Open Addressing — if a slot is taken, the algorithm probes nearby slots until it finds an empty one.
Modern languages handle this for you automatically, but understanding it helps you reason about performance in edge cases.
Trees — Hierarchical Data
A tree is a hierarchical structure made of nodes. Each tree has one root node at the top, and every node can have zero or more child nodes. Nodes with no children are called leaves.
Trees are everywhere: your computer's file system, the DOM in your browser, decision algorithms in AI, and database indexes all rely on tree structures.
Binary Search Tree (BST)
A Binary Search Tree is a special tree where each node has at most two children (left and right), and every node obeys one rule: all values in the left subtree are smaller, and all values in the right subtree are larger.
This property makes searching for a value extremely efficient — at each node you can immediately discard half the remaining tree.
Binary Search Tree in Python
python
class BSTNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def insert(root, value):
if root is None:
return BSTNode(value)
if value < root.value:
root.left = insert(root.left, value)
else:
root.right = insert(root.right, value)
return root
def search(root, value):
if root is None or root.value == value:
return root
if value < root.value:
return search(root.left, value)
return search(root.right, value)
# Build a small BST
root = None
for v in [50, 30, 70, 20, 40]:
root = insert(root, v)
result = search(root, 40)
print(result.value if result else "Not found") # 40Choosing the Right Data Structure
There is no single 'best' data structure — each one is a trade-off. Here is a practical cheat-sheet to guide your decisions:
| Need | Best Pick |
|-----------------------------------------|--------------------|
| Fast access by position | Array |
| Frequent middle insertions/deletions | Linked List |
| Undo history / function calls | Stack |
| Task scheduling / BFS | Queue |
| Fast key-value lookup | Hash Table |
| Hierarchical / sorted data | Tree / BST |
When in doubt, start with the simplest structure that satisfies your need, profile your code, and upgrade only when performance becomes a real problem.
What to Learn Next
This article covered the most essential data structures every developer should know. Once you're comfortable with these, the natural next step is to explore:
— Graphs: like trees but with cycles, used for maps and social networks.
— Heaps: a tree-based structure great for priority queues.
— Tries: specialized trees for fast string/prefix lookups.
— Big O Notation: the formal language for measuring and comparing the efficiency of these structures.
Related Links
Visualgo — Visualize Data Structures
An interactive site that animates how data structures and algorithms work step by step.
Big-O Cheat Sheet
A quick reference for the time and space complexity of every common data structure operation.
CS50 — Introduction to Computer Science (Harvard)
Harvard's free introductory CS course — covers data structures with excellent visual explanations.
LeetCode — Practice Problems
Hands-on coding challenges organized by data structure so you can practice what you've learned.