What is a Data Structure? Definition and Importance Explained
Table of Content:
Even though computers can perform literally millions of mathematical computations per second, when a problem gets large and complicated, performance can nonetheless be an important consideration. One of the most crucial aspects of how quickly a problem can be solved is how the data is stored in memory. To illustrate this point, consider going to the local library to find a book about a specific subject matter. Most likely, you will be able to use some kind of electronic reference or, in the worst case, a card catalog, to determine the title and author of the book you want. Since the books are typically shelved by category, and within each category sorted by author’s name, it is a fairly straightforward and painless process to then physically select your book from the shelves. Now, suppose instead you came to the library in search of a particular book, but instead of organized shelves, were greeted with large garbage bags lining both sides of the room, each arbitrarily filled with books that may or may not have anything to do with one another. It would take hours, or even days, to find the book you needed, a comparative eternity. This is how software runs when data is not stored in an efficient format appropriate to the application.
Data Structure is a way of collecting and organising data in such a way that we can perform operations on these data in an effective way. Data Structures is about rendering data elements in terms of some relationship, for better organization and storage. In simple language, Data Structures are structures programmed to store ordered data, so that various operations can be performed on it easily. It represents the knowledge of data to be organized in memory. It should be designed and implemented in such a way that it reduces the complexity and increases the effieciency.
The Need for Data Structures
You might think that with ever more powerful computers, program efficiency is becoming less important. After all, processor speed and memory size still continue to improve. Won’t any efficiency problem we might have today be solved by tomorrow’s hardware?
As we develop more powerful computers, our history so far has always been to use that additional computing power to tackle more complex problems, be it in the form of more sophisticated user interfaces, bigger problem sizes, or new problems previously deemed computationally infeasible. More complex problems demand more computation, making the need for effective programs even greater. Worse yet, as tasks become more complex, they become less like our everyday experience. Today’scomputer scientists must be trained to have a thorough understanding of the principles behind efficient program design because their ordinary life experiences often do not apply when designing computer programs.
In the most general sense, a data structure is any data representation and its associated operations. Even an integer or floating point number stored on the computer can be viewed as a simple data structure. More commonly, people use the term data structure to mean an organization or structuring for a collection of data items. A sorted list of integers stored in an array is an example of such a structuring.
Given sufficient space to store a collection of data items, it is always possible to search for specified items within the collection, print or otherwise process the data items in any desired order, or modify the value of any particular data item. Thus, it is possible to perform all necessary operations on any data structure. However, using the proper data structure can make the difference between a program running in a few seconds and one requiring many days.
A solution is said to be efficient if it solves the problem within the required resource constraints. Examples of resource constraints include the total space available to store the data — possibly divided into separate main memory and disk space constraints — and the time allowed to perform each subtask. A solution is sometimes said to be efficient if it requires fewer resources than known alternatives, regardless of whether it meets any particular requirements. The cost of a solution is the number of resources that the solution consumes. Most often, the cost is measured in terms of one key resource such as time, with the implied assumption that the solution meets the other resource constraints.
It should go without saying that people write programs to solve problems. However, it is crucial to keep this truism in mind when selecting a data structure to solve a particular problem. Only by first analyzing the problem to determine the performance goals that must be achieved can there be any hope of selecting the right data structure for the job. Poor program designers ignore this analysis step and apply a data structure that they are familiar with but which is inappropriate to the problem. The result is typically a slow program. Conversely, there is no sense in adopting a complex representation to “improve” a program that can meet its performance goals when implemented using a simpler design.
When selecting a data structure to solve a problem, you should follow these steps.
- Analyze your problem to determine the basic operations that must be supported. Examples of basic operations include inserting a data item into the data structure, deleting a data item from the data structure, and finding a specified data item.
- Quantify the resource constraints for each operation.
- Select the data structure that best meets these requirements.
This three-step approach to selecting a data structure operationalizes a data-centered view of the design process. The first concern is for the data and the operations to be performed on them, the next concern is the representation for those data, and the final concern is the implementation of that representation.
Resource constraints on certain key operations, such as search, inserting data records, and deleting data records, normally drive the data structure selection process. Many issues relating to the relative importance of these operations are addressed by the following three questions, which you should ask yourself whenever you must choose a data structure:
- Are all data items inserted into the data structure at the beginning, or are insertions interspersed with other operations? Static applications (where the data are loaded at the beginning and never change) typically require only simpler data structures to get an efficient implementation than do dynamic applications.
- Can data items be deleted? If so, this will probably make the implementation more complicated.
- Are all data items processed in some well-defined order, or is a search for specific data items allowed? “Random access” search generally requires more complex data structures.
Characteristics of a Data Structure
-
Correctness ? Data structure implementation should implement its interface correctly.
-
Time Complexity ? Running time or the execution time of operations of data structure must be as small as possible.
-
Space Complexity ? Memory usage of a data structure operation should be as little as possible.
Basic Terminology
-
Data ? Data are values or set of values.
-
Data Item ? Data item refers to single unit of values.
-
Group Items ? Data items that are divided into sub items are called as Group Items.
-
Elementary Items ? Data items that cannot be divided are called as Elementary Items.
-
Attribute and Entity ? An entity is that which contains certain attributes or properties, which may be assigned values.
-
Entity Set ? Entities of similar attributes form an entity set.
-
Field ? Field is a single elementary unit of information representing an attribute of an entity.
-
Record ? Record is a collection of field values of a given entity.
-
File ? File is a collection of records of the entities in a given entity set.
List out the areas in which data structures are applied extensively?
- Compiler Design
- Operating System
- Database Management System
- Statistical analysis package
- Numerical Analysis
- Graphics
- Artificial Intelligence
- Simulation