Skip to main content

Categories of Data Structures

For our purposes I will propose 5 classes of data structures.  These being the array, list, tree, hash, and graph.

Arrays
Arrays are one of the most basic data structures and are characterized by direct access to every element in the structure.  It is easy to picture them as a line of numbered buckets where you can place an element into or remove an element from any bucket.



Lists
Lists are another very basic data structure.  In a list each node has one of more connections to other nodes.  Access to the list is only at one or both ends.  From any given element you can access any element which you have a connection to.  Here is an example of a simple list.

Trees
Trees are similar to lists in that each element has connections to other elements but in a tree the connections are more structured.  Each element in a tree can have children.  The number of children an element can have is an important property of a tree.  The top element in a tree is called the root.  The root in the only element which can be accessed externally.  Any element with no children is called a leaf.  An example of a binary tree might look like this.


Hashes
A hash is similar to an array but the elements are accessed according to a mathematical function.
Sometimes the elements in the hash will have lists coming form them.



Graphs
Unlike the other data structures graphs may or may not have much structure.  We will get to these much later.

 Each of these categories has many variants variants with different properties.
 There are literally hundreds of different data structures and if you've never seen the one you need then you will never know when to use it.  We will examine each of these categories in more detail with following posts.

Comments

Popular posts from this blog

Duck typing considered harmful.

I've had a chance to work on a fairly large chunk of Python at this point and have decided that Python (like Perl) is completely untenable at scale.  <rant><rave><drool>  but wait!  I have reasons! Programmers spend most of their time reading code.  We read massive amounts of code.  After we read massive amounts of code we write one... or change one... To make a change to a piece of code you first have to understand what it does and how it interacts with the system around it.  Then we start reading massive amounts of code again.  Anything you can do to minimize the amount of code a programmer has to understand to make a change becomes a huge gain in productivity. Duck typing causes the amount of code you need to read to make a change to grow very large. For example lets look at two functions, one in C++ and one in Python. First in C++ int function(type1 arg1, type2 arg2) {   return arg1->method(arg2); } In this fun...

Tamiya Tracks

So... a "buddy" of mine got me this for Christmas this year.   I figure... hey, what the heck!  I should be able to get in trouble with this! It makes a nice little vehicle that works quite well.  I hope to turn this into a cool little telepresence rover but during the course of assembly I realized something... No "friend" should every give you something with pieces like this! I am seriously not built for handling pieces that small!

Segmented Sieve

Our next version works just like the previous sieve but works on a segment of the number line at a time. That might be why it is called a  segmented sieve . Now we are sieving multiples of the primes from 1M numbers at a time, adding those primes to the list and then moving on to the next 1M numbers. I hit 140M primes in 60 seconds on my box... but now I run into a new problem.  This version of the sieve quickly consumes all the memory in my box and slows to a crawl.  To progress we will need to find a more efficient way of storing the primes we have found.