• Published: August 18th, 2009
  • Comments: no responses

About the Different Database Driver Architectures

In this podcast Rob Steward explains the differences between database drivers. The podcast runs for 6:37.

Click on the following link to listen to the podcast: http://blogs.datadirect.com/media/RobSteward_DatabaseDriverArchitecture.mp3

Rob, you talk a lot about the different database driver architectures within The Data Access Handbook. Will you provide a little bit of insight about what differences there are between these different types of architectures, for instance bridge, client based, database wire protocol and independent protocol architectures?

Rob Steward:

Sure, Mike. There are certainly a number of different ways to build a database driver or provider. Some of the APIs that we talk about in The Data Access Handbook differentiate between the architectures of the driver. For example, the JDBC specification actually describes the differences between what it calls a type 1, a type 2, a type 3 and a type 4 driver. I’m going to talk about this from that type 1 through type 4 metaphor; however, I will apply it to ODBC and to ADO.NET, as well as JDBC. So a type 1 JDBC driver is really just a bridge driver between JDBC and ODBC. There really was only one type 1 JDBC driver ever built, and that’s what it was. It was shipped with Sun’s Virtual Machine (VM), and it was the JDBC to ODBC bridge. Really all it was was a thin Java layer that translated from JDBC calls to ODBC calls. The vast majority of the work was done in native code outside of the Java Virtual Machine, and that’s a good example of a bridge architecture. When I say bridge, I mean something that translates from one API to another. Typically when we talk about data access, that bridge is from one standards-based API to another standards-based API, or to some native database API.

Another example of a bridge provider is the OLE DB to ODBC bridge provider that Microsoft produced for OLE DB. Most of us know that as MSDASQL, or Kagera, which was the codename for it. A lot of people in the industry call it Kagera. That’s another bridge provider. There are also bridge providers within the .NET Framework for translating from ADO.NET to ODBC, and translating from ADO.NET to OLE DB. There are a lot of other bridges out there that I could probably talk about, but I think you get the idea from the examples.

The thing about bridge providers is that they’re typically inefficient, because the real work doesn’t go on in the bridge; the real work goes on in the underlying provider, whatever it is. But you’re just sort of adding an additional layer of overhead. In the Java world – in the JDBC world – there was one type 1 driver, and that was the JDBC to ODBC bridge; however, it really only existed because not a lot of JDBC drivers were in existence when the specification was first released. So that allowed people to write JDBC code and use their existing ODBC drivers underneath it in order to get started with JDBC. Now no one in the industry is going to tell you to use that bridge provider in a case where you have a type 2 or a type 4 JDBC driver in existence, because the bridge tends to do things in an inefficient way. It tends to just add overhead that you don’t need.
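The bridge idea can be sketched in a few lines of Python. Every class and method name below (`NativeOdbcLikeDriver`, `JdbcLikeBridge`, `exec_direct`, `execute_query`) is hypothetical and invented for illustration; none of it is a real JDBC or ODBC API. The sketch only shows that the bridge translates calls while the underlying driver does the real work.

```python
class NativeOdbcLikeDriver:
    """Stand-in for an existing driver; this is where the real work happens."""

    def __init__(self):
        self.calls = []  # record of the calls received, for illustration only

    def exec_direct(self, sql):
        self.calls.append(sql)
        # A real driver would send the statement to the database here.
        return [("row for: " + sql,)]


class JdbcLikeBridge:
    """Thin layer that translates one API's calls into another API's calls."""

    def __init__(self, underlying):
        self.underlying = underlying
        self.translated_calls = 0  # every call pays for one extra hop

    def execute_query(self, sql):
        self.translated_calls += 1
        return self.underlying.exec_direct(sql)


native = NativeOdbcLikeDriver()
bridge = JdbcLikeBridge(native)
rows = bridge.execute_query("SELECT 1")
print(rows)
print(bridge.translated_calls, len(native.calls))
```

Each application call passes through two layers here, which is the extra overhead described above: the bridge adds a hop without doing any of the database work itself.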

A type 2 driver is one where some portion of the driver exists in Java – or is pure Java coded – and it sits on top of some native component that does the communication with the database. In the case of type 2, you’ve got partially managed code in .NET terms, or partially Java code in Java terms, and partially native code for whatever operating system you’re running on. And there are a lot of type 2 drivers on the market today. Take the example of IBM’s type 2 driver for DB2 or Oracle’s type 2 driver for Oracle: those have a significant component that is Java. There is actually some work that happens in Java. But they sit on top of, in the case of Oracle, the Oracle networking software. The driver takes your JDBC call, does some work, and calls outside of the VM for those unmanaged components.

Same thing exists in the .NET world where you may have some portion of the data provider – the ADO.NET data provider – that runs within the Common Language Runtime (CLR), and then some other native component that runs on Windows that it calls out to. So that’s kind of a type 2 driver.

A type 3 solution – which there are a few but not a whole lot in the world – as defined by the JDBC specification means it’s a pure Java driver, but it talks to some server component which then in turn talks to the database. So you have a client included in your application, and it is pure Java. There are no native components that you have to distribute that are OS dependent, but there are some server components that you have to set up. So that driver – pure Java – talks to its server component, which then in turn talks to your SQL Server, or your Oracle, or your DB2 server or MySQL, or whatever it may be. So that’s a type 3.

A type 4 driver – in JDBC terms – is pure Java. It opens up network sockets, and talks directly to the database. It never calls native OS code. It stays purely within the VM. So that’s kind of a type 4. Same thing exists for ADO.NET where you have a 100% managed ADO.NET provider. It will stay completely within the CLR.
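As a rough summary of the four types described above, here is a small Python sketch. The `DriverType` class, its field names, and the `standalone_types` helper are assumptions made up for this illustration; the descriptions paraphrase the podcast rather than quote the JDBC specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriverType:
    number: int
    description: str
    needs_native_code: bool    # calls outside the VM / CLR on the client
    needs_middle_server: bool  # separate server component between driver and database


DRIVER_TYPES = [
    DriverType(1, "bridge to another API (e.g. JDBC to ODBC)", True, False),
    DriverType(2, "partly Java or managed code on top of a native client", True, False),
    DriverType(3, "pure Java client talking to a server component", False, True),
    DriverType(4, "pure Java, talks directly to the database over sockets", False, False),
]


def standalone_types(types):
    """Types with nothing extra to install besides the driver itself."""
    return [t.number for t in types
            if not t.needs_native_code and not t.needs_middle_server]


print(standalone_types(DRIVER_TYPES))
```

Under these assumptions only type 4 needs neither native client code nor a separate server component.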

  • Published: May 5th, 2009
  • Comments: no responses

Criteria for a Highly Effective Database Driver: Part 1 ODBC

In this two-part podcast Rob Steward discusses what to look for in a highly effective database driver. Part 1, which runs 3:49, focuses on ODBC, while Part 2 will concentrate on JDBC.

Click on the following link to listen to the podcast: http://dataaccesshandbook.com/media/Rob9.mp3

Rob Steward:

Well of course I’ve spent my career developing and producing database drivers of all sorts: ODBC, JDBC, OLE DB and ADO.NET. And I’ve spent my career looking at different implementations, and looking at the way these things are written and how they interact with the databases. And I can tell you unequivocally that they make a huge difference in terms of the overall performance, reliability, and scalability of your data access applications.

There are several things that you need to look at when you take a cursory glance – a quick look – at a driver. What separates one from the other? The first thing I’d say is the architecture of that driver. ODBC was the first real standards-based API that came out for which companies built drivers. What that driver eventually does is translate from that common API into some underlying thing that talks to a database or a data source. Drivers do more than just that translation, but I’ll concentrate on that.

As for what it translates to underneath: when ODBC was originally released – or if you look at, say, JDBC when it originally came out and JDBC drivers were just beginning to appear as well – the way they were written was you had this driver piece that sat on top of some other client API from the database vendors themselves. So let’s say in the case of Oracle you may have an ODBC driver, which is this layer that does this translation – it handles the error mappings, and it handles character translations, and all kinds of other things. But it sits on top of what’s called OCI from Oracle – the client interface that Oracle has – and then that actually talks across the network to the database server, to the Oracle server. The same thing is true with DB2, or with SQL Server, or Sybase or Informix or any of the other ones we’re familiar with.

One big difference among drivers is whether the driver can talk directly to the database across the network, or whether it needs to sit on top of some other layer. In ODBC terms, we call the former a wire protocol driver. So if a driver doesn’t need any of those underlying client pieces from the database vendor, and it can directly open up a TCP/IP socket or a named pipe or something to the database, then we call that a wire protocol driver. It is a standalone piece that you can put with your application that can talk directly to the database. It doesn’t need the installation of some other piece, which can cause you all kinds of versioning issues. Particularly we see it with Oracle, where people say, ‘Well, I need a particular version of the ODBC driver, but I also need some particular version of this client piece, but I have another application that sits on the same machine that needs a different version of that client piece or a different version of the ODBC driver.’ And what you end up with is a big mess where it’s difficult in a Windows environment to use all of those things on the same machine.
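The versioning problem described above can be illustrated with a toy Python check. The application names, component names, and version numbers are all made up, and the sketch assumes that a wire protocol driver is shipped privately with each application while a vendor client library is shared machine-wide.

```python
def find_conflicts(requirements):
    """Return shared components for which applications need different versions.

    requirements maps application name -> {component name: required version}.
    """
    versions_needed = {}
    for app, components in requirements.items():
        for component, version in components.items():
            versions_needed.setdefault(component, set()).add(version)
    return sorted(c for c, versions in versions_needed.items() if len(versions) > 1)


# Client-based drivers: each application depends on a driver and on a
# machine-wide vendor client library (hypothetical versions).
client_based = {
    "app_a": {"odbc_driver": "1.0", "vendor_client": "9"},
    "app_b": {"odbc_driver": "1.0", "vendor_client": "10"},
}

# Wire protocol drivers: each application carries only its own private driver,
# so no component is shared between applications (an assumption of this sketch).
wire_protocol = {
    "app_a": {"app_a_private_driver": "1.0"},
    "app_b": {"app_b_private_driver": "2.0"},
}

print(find_conflicts(client_based))
print(find_conflicts(wire_protocol))
```

In this model the conflict comes from the shared client piece; removing that piece from the picture removes the conflict.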


  • Published: April 21st, 2009
  • Comments: no responses

Why The Data Access Handbook is a Must-Have Resource for Architects, Programmers and DBAs

In this podcast Rob Steward explains who The Data Access Handbook was written for, and what benefits they will get out of its content. The podcast runs for 3:00.

You can listen to the podcast by clicking on the following link: http://dataaccesshandbook.com/media/Rob7.mp3

From the Podcast:
Why would the development audience and the database architect be interested in The Data Access Handbook?

Rob Steward:

We wrote the book specifically targeting architects, programmers and DBAs. That may seem like a broad audience, but the reason that I can say that we targeted all three of them is that the book has different sections. In the beginning of it we talk more to the architect, but it’s also beneficial for the DBA or the programmer, in that we go through the concepts of what middleware is and why it affects things. We talk about how it actually interacts with the network. How it talks to the database. What does that mean in your overall architecture? What are the architecture choices that you make, specifically concerning the middleware and how you access your data, that have a great impact on your overall performance? And that’s kind of targeted again towards architects, but also to programmers.

And then we’ve got sections of the book – there are three chapters where we go through very specific code examples using ODBC, JDBC and ADO.NET – and those three chapters are meant to be a reference for programmers to say, ‘If we take the concepts that we’ve gone over in the beginning of the book, how do we apply those within those standards?’ What does it mean when I say, ‘You need to make sure that you manage your transactions correctly’? What does that actually mean when it comes to writing code in ODBC and JDBC? So there is very specific content for programmers, but, again, back up: the architects, who first look at the problem or the application, need to understand on a conceptual level how this thing needs to be designed. And then for the programmers, what are the specifics of the code that you want to write to implement the architectures that we’ve talked about?
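As one hedged illustration of what managing transactions can look like in code, the Python sketch below contrasts committing after every statement with grouping statements into one explicit transaction. It uses Python's built-in `sqlite3` module as a stand-in; it is not taken from the book, which covers ODBC, JDBC and ADO.NET, and the `orders` table and function names are invented here. It measures nothing and only shows the two shapes of code.

```python
import sqlite3


def insert_autocommit(conn, rows):
    """Each INSERT is its own transaction, committed separately."""
    for row in rows:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", row)


def insert_one_transaction(conn, rows):
    """All INSERTs share one explicit transaction with a single commit."""
    conn.execute("BEGIN")
    try:
        conn.executemany("INSERT INTO orders (id, item) VALUES (?, ?)", rows)
        conn.execute("COMMIT")
    except Exception:
        # Undo the whole group if any statement fails.
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the sqlite3 module in autocommit mode,
# so a transaction starts only when we issue BEGIN ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

insert_autocommit(conn, [(1, "a"), (2, "b")])
insert_one_transaction(conn, [(3, "c"), (4, "d")])

count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)
```

The explicit transaction also makes the group of inserts all-or-nothing: if one row fails, the rollback removes the rows inserted before it in the same group.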

And then for DBAs – you know I talk to DBAs all the time – and DBAs are frustrated with the same thing; I’ve been told over and over by DBAs, ‘Well everybody always blames everything on the database; that it’s always in the database.’ And those guys, the DBAs, are probably the ones that know best that the problem’s not always in the database. Because they can tune and they can do these things, and it doesn’t make any difference. So they know it’s somewhere outside of the database, but again they’re not the programmers. But what this book does for DBAs is explain some of these things that the programmers are doing that have that big effect on the overall performance. So they can understand, when those problems aren’t in the database, what kind of problems those could be, and what they need to tell people to look for in their applications.

So I can say that we targeted this book at architects, programmers and DBAs, and I think there’s a lot in this book for all three of ‘em.


  • Published: April 14th, 2009
  • Comments: no responses

Don’t Overlook the Importance of Middleware to Database Application Performance

In this podcast Rob Steward explains why architects, designers, programmers, and DBAs have overlooked middleware as a way to improve database performance, and why he wrote The Data Access Handbook to help educate the community on the middleware’s importance. The podcast runs for 2:42.

To listen to the podcast, please click on the following link: http://dataaccesshandbook.com/media/Rob6.mp3

Podcast text:

Why was there such latency in focusing on middleware for these performance and connectivity issues?

Rob Steward:

Well I think that the reason many people overlook the middleware and the network as a performance issue is because you can walk into Barnes & Noble, or any Borders, or you can go on Amazon, and you’ll see book after book after book written on how to tune those databases. And you can go to lots and lots of sessions and seminars, and entire conferences built on how do I tune my Oracle database? Or how do I tune my SQL Server database? So what you hear as an architect, designer, programmer, or as a DBA is that all the problems occur on the database. And so that’s what the focus has been in our industry. In fact, there is a whole industry built around tuning databases. We all know somebody whose job that is. They may be a consultant that gets hired for a couple of months to look over somebody’s database configuration and tuning, and figure that out. Granted, that’s a very important thing. You do need to tune your database.

What’s not been out there is the general knowledge of what the impact of middleware can actually be. Now I sell software. I work for a software company that makes database connectivity, which is part of why I know all these things about the middleware, but I see it over and over. Now I’ve sold a lot of software because that middleware is actually the problem, and not the database.

And again I think the reason people haven’t realized it – or not as much as they should – is because everything you read, and the experts you listen to, are going to tell you all your problems are in your database. Then if they can’t solve it by tuning, or doing the things that they do, they’re going to say you need better hardware. Well I’ve seen hundreds and thousands of times, literally, that that’s not the answer. The actual problem is in that middleware layer.

So I think people tend to learn, but they learn it the hard way. They’ve learned that their application doesn’t perform well enough. They’ve done what all the experts say. They may have spent millions of dollars hiring people to come in to try to tune their environment, and it still doesn’t work. And the reason it doesn’t work, again, is because that’s not where their problem was, or it’s where a small part of their problem was.

I guess I would sum it up by saying it is education. And again, the reason I wrote a book on the subject is because there’s just not any information out there, and there’s not people talking about what kind of impact middleware can have.

  • Published: April 7th, 2009
  • Comments: no responses

The Importance of Accessing Data – Including Cloud-Based Data Sources – with Standards-Based APIs

In this podcast, Rob Steward explains what the future holds for database connectivity. The podcast runs for 3:51.

To listen to the podcast, please click on the following link: http://dataaccesshandbook.com/media/Rob5.mp3

Podcast text:

What do you see in the future in database connectivity?

Rob Steward:

Well I think the future is still standards based. We talk specifically in the book about ODBC, JDBC, and ADO.NET. These are standard data access APIs, which in fact most of the world uses to access their data. Now you may use something like Hibernate or NHibernate on top of JDBC or ADO.NET, but in a way those are really standard ways to access your data as well. And they’re still sitting on top of those JDBC drivers or those ADO.NET data providers. So I think that the future is still standards based because of all the benefits of using a standard: you learn one API instead of learning one API for every database you want to access. But I also think there are some influences on the industry now that will change some of those standard ways to access data.

The biggest thing I see coming is cloud computing. In the cloud it’s still data. Let’s say you use Salesforce.com as your CRM; you still need to take that data and integrate it with your applications and with your data that lives within your firewall. So, I think one of the big influences on the future of connectivity is, how do we get access in a standard way to those cloud sources? If you build an application on the Google App Engine, or you build an application on Force.com, or you’re using some SaaS-type application like the Salesforce CRM, how do we get to that data and integrate it with everything else in your enterprise? So I think that what we’ll see is the connectivity start to branch out and begin to access those types of sources in those standard ways so that you can plug it into all those applications that you know and love today. You may have a business intelligence application that can handle any ODBC connection; well, you still need that ODBC connectivity to those cloud sources, or for a JDBC application, or your .NET applications. How do we get that data from those sources, instead of the traditional relational sources, into your application, into your enterprise? I think that’s one of the big things.

Now cloud computing and accessing data across the internet or cloud-type interfaces has its own unique challenges that will need to be addressed by the connectivity vendors. Personally I’ve been spending a lot of time in that space lately, looking at what the unique challenges are. When you access something across the internet, there are some interesting performance implications when you think about the latency that’s going to be there that doesn’t really exist when you’re inside your firewall. How do you reduce the number of round trips to get that data? And how do you reduce the number of web service calls that you make to get that data out of the cloud? I think that’s going to have a big influence on the direction of connectivity for the future. Those are the problems that have to be solved to make these types of architectures, i.e. those using cloud sources, perform as well as what people are used to with absolute control inside the firewall. So that’s what I think is probably the biggest thing that I see that’s going to influence connectivity moving forward.
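A back-of-the-envelope Python sketch shows why the number of round trips matters more as latency grows. The latency figures (1 ms inside the firewall, 100 ms across the internet), the row count, and the function names are illustrative assumptions, not measurements, and the model ignores server processing and data transfer time.

```python
import math


def round_trips(row_count, rows_per_call):
    """Number of calls needed to fetch row_count rows, rows_per_call at a time."""
    return math.ceil(row_count / rows_per_call)


def latency_cost_seconds(row_count, rows_per_call, latency_ms):
    """Time spent purely waiting on the network for all round trips."""
    return round_trips(row_count, rows_per_call) * latency_ms / 1000


rows = 10_000
for label, latency_ms in [("inside firewall", 1), ("across internet", 100)]:
    for rows_per_call in (1, 500):
        cost = latency_cost_seconds(rows, rows_per_call, latency_ms)
        print(f"{label}, {rows_per_call} rows per call: {cost:.2f} s")
```

In this simple model the waiting time is just round trips multiplied by latency, so fetching more rows per call reduces the cost by the same factor regardless of where the data lives; the absolute saving is simply far larger when each trip is slow.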


Book Content Copyright © 2009 by Prentice Hall PTR. All rights reserved. | Corporate Sponsor DataDirect Technologies.