Sunday, September 5, 2021

Connecting to Oracle Cloud Autonomous Database through SQLAlchemy

 SQLAlchemy is a powerful Python library that abstracts the communication between Python applications and relational databases. It provides two main components:

  • An SQL abstraction toolkit called the Core that converts Python expressions to SQL statements
  • An optional but widely used Object Relational Mapper (ORM) component that maps Python objects to the Relational DB tables

SQLAlchemy is used widely in data science and analysis for reading data from huge datastores. It is used in tandem with popular Python data libraries such as NumPy and Pandas (you can refer to this informative article!) for efficient data retrieval and processing. In short, SQLAlchemy eliminates the need for writing complex SQL Queries within Python programs for any database operation, which speeds up the productivity and efficiency of the Python applications.

SQLAlchemy works very well with Oracle on-premises and Cloud Autonomous Databases (ADBs) and can seamlessly tap into the unique features of Oracle Database like Session Pooling, DRCP, Shared servers, Sharding, etc.

This post will describe how to

  • make a simple connect to Oracle ADB through SQLAlchemy
  • create client-side session pool connections on the Oracle ADB through SQLAlchemy

To connect your applications to an Oracle Autonomous Cloud Database - using any of the Oracle ADB services like Autonomous Data Warehouse (ADW), Autonomous Transaction Processing (ATP), or the new Oracle Autonomous JSON Database (AJD) - you will need to download and use a 'wallet'. This gives mutual TLS, which provides enhanced security for authentication and encryption.

Pre-requisites

This post assumes you've already created an Oracle Cloud database. But if you don't already have one, you can quickly get an 'Always Free' account for no cost and create a database. There are other posts and documentation that show this, such as this blog. In summary, log in to your cloud account and, from the left-hand menu, select which of the three kinds of database services suit your workload: ADW, ATP, or AJD (Choose ATP if you are unsure). Click through the few prompts and enter the required information, such as the database name. My database is named scrajudb_high.

During database creation, the 'Allow secure access from everywhere' option is selected. This option gives easy access to the database from your Python applications.

During creation, you'll also set a password for the privileged database ADMIN user. (See this blog to understand how to use ADMIN to create a 'normal' un-privileged user).

Also, please check that Python (v3.6 and above) and Oracle Instant Client (latest version) are installed on your local machine.

Download the Oracle Database wallet files

Downloading the wallet is simple. Please check out this excellent article on Connecting to Oracle ADBs by Christopher Jones for the same.

Connecting to an Autonomous Database using DSN in SQLAlchemy

For connecting to the Oracle DB, I will be running my Oracle client on a Windows 10 local machine. Oracle DB's Python driver - cx_Oracle - is used along with SQLAlchemy for connecting to the Oracle ADB. The cx_Oracle installation instructions are available here.

  • Download the wallet ZIP file, as shown in the earlier section

  • Extract the cwallet.ssosqlnet.ora, and tnsnames.ora files. Keep the files in a secure location:

      -rw-r--r--@  1 scraju  staff   6725 15 Aug 00:12 cwallet.sso
      -rw-r--r--   1 scraju  staff    134 15 Aug 10:13 sqlnet.ora
      -rw-r--r--@  1 scraju  staff   1801 15 Aug 00:12 tnsnames.ora
    

    The other files in the ZIP are not needed for the Oracle Python driver (cx_Oracle). They would be used, for example, if you were going to connect via JDBC.

  • There are now two options:

    • Move the three files to the network\admin directory of the client libraries used by your application. For example, if you are using Instant Client 19c, you would put them in C:\Program Files\Oracle\instantclient_19_11\network\admin directory.

    • Alternatively, move them to any secure, accessible directory, for example, C:\Users\scraju\Cloud\DB_WALLETDIR.

      Now here's the important step: Edit sqlnet.ora and change the wallet location directory to the directory containing the cwallet.sso file:

      WALLET_LOCATION = (SOURCE = (METHOD = file) (METHOD_DATA = (DIRECTORY="C:\Users\scraju\Cloud\DB_WALLETDIR")))
      SSL_SERVER_DN_MATCH=yes
      

      Then you need to indicate to the Python Oracle libraries where the tnsnames.ora and sqlnet.ora files are, for example, by setting the environment variable TNS_ADMIN to C:\Users\scraju\Cloud\DB_WALLETDIR. The Python Oracle library (cx_Oracle version 8 and above) will let you pass the directory at runtime using the config_dir argument of the function init_oracle_client(). Neither of these is needed, and you don't need to edit sqlnet.ora, if you have put all the files in the network\admin directory.

Now that the wallet is set, We can connect to the Cloud ADB using one of the connection strings from tnsnames.ora as the DSN in SQLAlchemy as below:

import cx_Oracle
from sqlalchemy import create_engine
import sys
import os

if sys.platform.startswith("darwin"):
    cx_Oracle.init_oracle_client(
        lib_dir=os.environ.get("HOME")+"/Downloads/instantclient_19_8",
        config_dir="")
elif sys.platform.startswith("win"):
    cx_Oracle.init_oracle_client(
        lib_dir=r"C:\\Program Files\\Oracle\\instantclient_19_11")
# else assume system library search path includes Oracle Client libraries
#  On Linux, use ldconfig or set LD_LIBRARY_PATH, as described in installation documentation.

username = "admin"
# set the password in an environment variable called "MYPW" for security
password = os.environ.get("MYPW")
dsn = "scrajudb_high"

engine = create_engine(
    f'oracle://{username}:{password}@{dsn}/?encoding=UTF-8&nencoding=UTF-8', max_identifier_length=128)

with engine.connect() as conn:
    print(conn.scalar("select sysdate from dual"))

We use SQLAlchemy's create_engine() function to create an engine object that will handle the Oracle ADB connection. The above program queries the database for the current date.

Connecting to Autonomous Database using Connection Pooling in SQL Alchemy

Oracle DB's Python library cx_Oracle lets applications create and maintain a pool of connections to the database. Connection pooling is essential for performance when applications frequently connect and disconnect from the database. The pool implementation uses Oracle’s session pool technology which supports Oracle’s high availability features and is recommended for applications that must be reliable. This also means that small pools can be helpful in applications that require a few connections available for infrequent use.

The Wallet location and files need to be set and ready, as mentioned in the previous section.

A connection pool is created by calling SessionPool() function. This function is generally called during application initialization. The initial pool size, maximum pool size, and increment parameters are provided at the time of pool creation. 

Once the Session Pool has been created, we use the SQLAlchemy's create_engine() function (with pool.acquire passed as a value to the creator parameter) to create an engine object that will handle the pooled connections to Oracle ADB.

import cx_Oracle
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool
import sys
import os

if sys.platform.startswith("darwin"):
    cx_Oracle.init_oracle_client(lib_dir=os.environ.get(
        "HOME")+"/Downloads/instantclient_19_8")
elif sys.platform.startswith("win"):
    cx_Oracle.init_oracle_client(
        lib_dir=r"C:\\Program Files\\Oracle\\instantclient_19_11")
# else assume system library search path includes Oracle Client libraries
# On Linux, use ldconfig or set LD_LIBRARY_PATH, as described in installation documentation.

# Create the pool once at startup
# --------------------------------------
# ADB Connection
username = "admin"
# set the password in an environment variable called "MYPW" for security
password = os.environ.get("MYPW")
connect_string = "scrajudb_high"

# --------------------------------------
# With cx_Oracle 8.0, the encodings default to UTF-8.
# With cx_Oracle 8.2, the threaded parameter is ignored and threading is always used.
# In summary, you can simplify the call with the latest version
# --------------------------------------
pool = cx_Oracle.SessionPool(user=username, password=password, dsn=connect_string,
                             min=4, max=4, increment=0, threaded=True,
                             encoding="UTF-8", nencoding="UTF-8")


engine = create_engine("oracle://", creator=pool.acquire,
                       poolclass=NullPool, max_identifier_length=128)

# Then each 'user' will use a pooled connection

with engine.connect() as conn:
    print(conn.scalar("select sysdate from dual"))

The above program queries the database for the current date.

That should get you up and running with Oracle ADB and SQLAlchemy!

No comments:

Must Watch YouTube Videos for Databricks Platform Administrators

  While written word is clearly the medium of choice for this platform, sometimes a picture or a video can be worth 1,000 words. Below are  ...