Introducing Data Science by Davy Cielen, Arno D. B. Meysman, Mohamed Ali
INTRODUCING DATA SCIENCE is a broad introduction to the field. Each chapter includes the theory, as well as practical examples. This is a big, complex field, and this book will take some time to absorb. For example, the authors illustrate in Chapter 6 the large number of database products that are used in this field. These include both "NoSQL" and "NewSQL" designs. There isn't just one "right" method. To be honest, I didn't know there was any such thing as "graph" databases.
I recommend trying some of the detailed examples to get a feel for the subject matter. For example, the authors show, in detail, how to use a custom Python program to do some data mining on Wikipedia. The example pretends that you are researching diseases using information in Wikipedia, and shows you how to write the program.
I thought the best chapter was Chapter 6, "Join the NoSQL Movement." Here, the authors explain the intrinsic limitations of the traditional RDBMS structure. A typical RDBMS table is stored row by row, with ALL the columns of each row together. This often works well, but what if you only want certain columns? Well, that's too bad--you will end up "touching" all the other columns as well.
Column databases don't have the above limitation. So, they are far faster at scanning through large amounts of information when you just want certain columns. (As an Oracle DBA for several decades, I can confirm that the authors correctly state the limitation.)
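To make the row-versus-column point concrete, here is a toy Python sketch. It is not taken from the book, and the table, column names, and function names are all made up for illustration. It only counts how many values each layout has to "touch" to total one column; real database engines work with disk pages and compression, not Python lists.

```python
# Toy illustration only: hypothetical data and names, not from the book.
rows = [
    (1, "Alice", "NL", 120.0),
    (2, "Bob", "BE", 80.0),
    (3, "Carol", "US", 200.0),
]

def total_row_store(records, col_index):
    """Row layout: every record keeps all of its columns together."""
    touched = 0
    total = 0.0
    for record in records:
        touched += len(record)      # the whole record is read for one field
        total += record[col_index]
    return total, touched

# Column layout: each column is stored as its own separate sequence.
columns = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "country": [r[2] for r in rows],
    "amount": [r[3] for r in rows],
}

def total_column_store(cols, name):
    """Column layout: only the requested column is read."""
    values = cols[name]
    return sum(values), len(values)

print(total_row_store(rows, 3))               # total and values touched
print(total_column_store(columns, "amount"))  # same total, fewer values touched
```

Both calls return the same total, but in this sketch the row layout touches every value in the table while the column layout touches only the one column it needs. That is the limitation the authors describe, in miniature.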
So all in all, I found INTRODUCING DATA SCIENCE to be a very good book--albeit a bit overwhelming. I found the examples especially helpful. The book has several appendices that explain how to install required libraries for use in the chapter examples.