Easysoft ODBC-Apache Spark Driver User's Guide - Configuration

Configuring the Easysoft ODBC-Apache Spark Driver

The Easysoft ODBC-Apache Spark Driver is installed on the computer where your applications are running. ODBC applications access ODBC drivers through the ODBC Driver Manager and a data source. The data source tells the Driver Manager which ODBC driver to load, which Spark Thrift server to connect to and how to connect to it. This chapter describes how to create data sources, use DSN-less connections and configure the Easysoft ODBC-Apache Spark Driver.

Before setting up a data source, you must have successfully installed the Easysoft ODBC-Apache Spark Driver.

For Easysoft ODBC-Apache Spark Driver installation instructions, see Installation.

Chapter Guide

Configuring the Easysoft ODBC-Apache Spark Driver

This section describes how to configure the Easysoft ODBC-Apache Spark Driver to connect to Apache Spark via a Spark Thrift server by using a data source or a DSN-less connection string. The section assumes you are, or are able to consult with, a database administrator.

Setting Up Data Sources on Unix

There are two ways to set up a data source for your Apache Spark data: create a SYSTEM data source or create a USER data source.

By default, the Easysoft ODBC-Apache Spark Driver installation creates a SYSTEM data source named [SPK_SAMPLE]. If you are using the unixODBC included in the Easysoft ODBC-Apache Spark Driver distribution, the SYSTEM odbc.ini file is in /etc.

If you built unixODBC yourself, or installed it from some other source, SYSTEM data sources are stored in the path specified with the configure option --sysconfdir=directory. If sysconfdir was not specified when unixODBC was configured and built, it defaults to /usr/local/etc.

If you accepted the default choices when installing the Easysoft ODBC-Apache Spark Driver, USER data sources must be created and edited in $HOME/.odbc.ini.


Note

To display the directory where unixODBC stores SYSTEM and USER data sources, type odbcinst -j.

By default, you must be logged in as root to edit a SYSTEM data source defined in /etc/odbc.ini.


You can either edit the sample data source or create new data sources.

Each section of the odbc.ini file starts with a data source name in square brackets [ ] followed by a number of attribute=value pairs.


Note

Attribute names in odbc.ini are not case sensitive.


The Driver attribute identifies the ODBC driver in the odbcinst.ini file to use for a data source.

When the Easysoft ODBC-Apache Spark Driver is installed into unixODBC, it places an Easysoft Apache Spark ODBC entry in odbcinst.ini. Easysoft ODBC-Apache Spark Driver data sources therefore need to include a Driver = Easysoft Apache Spark ODBC entry.

To configure an Apache Spark data source, you need to specify the Spark Thrift server and authentication details in your odbc.ini file.

For example:

 [SPARK_SAMPLE]
 Driver=Easysoft Apache Spark ODBC
 Description=Easysoft Apache Spark ODBC driver
 Server=mythriftserver
 Port=10000
 Logging=0
 LogFile=
 Authentication=None
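As a rough illustration of the section and attribute=value layout described above, the following Python sketch parses the sample data source with the standard configparser module and reports attributes that are missing or empty. The function name missing_attributes and the choice of Driver, Server and Port as a minimum set are assumptions made for this example; they are not part of the driver or of unixODBC.

```python
import configparser

# The sample data source from this section, with one attribute per line.
SAMPLE_ODBC_INI = """\
[SPARK_SAMPLE]
Driver=Easysoft Apache Spark ODBC
Description=Easysoft Apache Spark ODBC driver
Server=mythriftserver
Port=10000
Logging=0
LogFile=
Authentication=None
"""

# Hypothetical minimum set of attributes checked by this sketch.
REQUIRED = ("driver", "server", "port")


def missing_attributes(ini_text, dsn):
    """Return the required attributes that are absent or empty for one data source."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(dsn):
        raise KeyError(f"data source [{dsn}] not found")
    section = parser[dsn]
    # configparser lower-cases attribute names, so the lookup ignores case.
    return [name for name in REQUIRED if not section.get(name, "").strip()]
```

Calling missing_attributes(SAMPLE_ODBC_INI, "SPARK_SAMPLE") returns an empty list for the sample above. Note that configparser matches section names case-sensitively, which may differ from how your Driver Manager treats data source names.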

Environment

The Easysoft ODBC-Apache Spark Driver must be able to find the following shared objects, which are installed during the Easysoft ODBC-Apache Spark Driver installation:

libodbcinst.so

By default, this is located in /usr/local/easysoft/unixODBC/lib.

libeslicshr.so

By default, this is located in /usr/local/easysoft/lib.

libessupp.so

By default, this is located in /usr/local/easysoft/lib.

You may need to set and export LD_LIBRARY_PATH, SHLIB_PATH or LIBPATH (depending on your operating system and run-time linker) to include the directories where libodbcinst.so, libeslicshr.so and libessupp.so are located.


Note

The shared object file extension (.so) may vary depending on the operating system (.so, .a or .sl).
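The following Python sketch shows one way to check whether the shared objects named above can be found, and how a search-path value for LD_LIBRARY_PATH, SHLIB_PATH or LIBPATH might be extended. The function names find_libraries and extended_search_path are hypothetical helpers written for this example. The directories and extensions are the defaults quoted in this section and may differ on your system.

```python
import os

# Default locations, library names and extensions quoted in this section.
DEFAULT_DIRS = ("/usr/local/easysoft/unixODBC/lib", "/usr/local/easysoft/lib")
LIBRARIES = ("libodbcinst", "libeslicshr", "libessupp")
EXTENSIONS = (".so", ".a", ".sl")


def find_libraries(dirs=DEFAULT_DIRS):
    """Map each library base name to the first matching file found, or None."""
    found = {}
    for lib in LIBRARIES:
        candidates = (os.path.join(d, lib + ext) for d in dirs for ext in EXTENSIONS)
        found[lib] = next((path for path in candidates if os.path.exists(path)), None)
    return found


def extended_search_path(current, dirs=DEFAULT_DIRS):
    """Return a search-path value with dirs appended, skipping duplicates."""
    parts = [p for p in current.split(os.pathsep) if p]
    for d in dirs:
        if d not in parts:
            parts.append(d)
    return os.pathsep.join(parts)
```

For example, extended_search_path(os.environ.get("LD_LIBRARY_PATH", "")) produces a value that you could then set and export in your shell before starting the application.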


Establishing a Test Connection

The isql query tool lets you test your Easysoft ODBC-Apache Spark Driver data sources.

To test the Easysoft ODBC-Apache Spark Driver connection

1.  Change directory into /usr/local/easysoft/unixODBC/bin.

2.  Type ./isql.sh -v data_source, where data_source is the name of the target data source.

3.  At the prompt, type an SQL query. For example:

 SQL> select * from MyTable;

- OR -

 Type help to return a list of tables:

 SQL> help
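The steps above can also be scripted. The following Python sketch builds the isql.sh command from step 2 and runs it from the default directory given in step 1. It assumes that isql.sh reads queries from standard input, as the unixODBC isql tool does; the function names isql_command and run_query are hypothetical helpers, not part of the Easysoft distribution.

```python
import subprocess

# Default location of isql.sh, taken from step 1.
ISQL_DIR = "/usr/local/easysoft/unixODBC/bin"


def isql_command(data_source):
    """Build the command from step 2 for the given data source name."""
    if not data_source:
        raise ValueError("data source name must not be empty")
    return ["./isql.sh", "-v", data_source]


def run_query(data_source, query, isql_dir=ISQL_DIR):
    """Run one query through isql.sh and return (exit code, combined output).

    Assumption: isql.sh accepts the query on standard input.
    """
    result = subprocess.run(
        isql_command(data_source),
        cwd=isql_dir,          # the relative ./isql.sh path is resolved here
        input=query + "\n",
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

For example, run_query("SPARK_SAMPLE", "select * from MyTable;") would send the query from step 3 to the sample data source, provided the driver is installed and the Spark Thrift server is reachable.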

 

Setting Up Data Sources on Windows

To connect an ODBC application on a Windows machine to an Apache Spark server:

1.  Open ODBC Data Source Administrator.

 The ODBC Data Source Administrator dialog box is displayed.

2.  Select the User DSN tab to set up a data source that only you can access.

- OR -

 Select the System DSN tab to create a data source which is available to anyone who logs on to this Windows machine.

3.  Click Add... to add a new data source.

 The Create New Data Source dialog box displays a list of drivers.

4.  Select Easysoft ODBC-Apache Spark Driver and click Finish.

 The DSN Setup dialog box is displayed. For details of the attributes that can be set on this dialog box, see Attribute Fields.


64-bit Windows

The Easysoft installer program installs both a 32-bit and a 64-bit version of the Easysoft ODBC-Apache Spark Driver. If you want to use a 64-bit ODBC application, you need to use the 64-bit Easysoft ODBC-Apache Spark Driver. If you want to use a 32-bit ODBC application, you need to use the 32-bit Easysoft ODBC-Apache Spark Driver.

There is both a 32-bit and a 64-bit version of ODBC Administrator. The 64-bit ODBC Administrator is located in Control Panel under Administrative tools. To access the 32-bit ODBC Administrator in Windows 7 and earlier, in the Windows Run dialog box, type:

%windir%\syswow64\odbcad32.exe

On Windows 8 and later, both the 32-bit and 64-bit ODBC Administrator are located in Control Panel under Administrative tools: ODBC Data Sources (32-bit) and ODBC Data Sources (64-bit).

Easysoft ODBC-Apache Spark Driver data sources created in the 64-bit ODBC Administrator will specify the 64-bit version of the Easysoft ODBC-Apache Spark Driver. Easysoft ODBC-Apache Spark Driver data sources created in the 32-bit ODBC Administrator will specify the 32-bit version of the Easysoft ODBC-Apache Spark Driver.

If you want to create an Easysoft ODBC-Apache Spark Driver System data source for use with a 64-bit application, use the 64-bit ODBC Administrator. If you want to create an Easysoft ODBC-Apache Spark Driver System data source for use with a 32-bit application, use the 32-bit ODBC Administrator.

For Easysoft ODBC-Apache Spark Driver User data sources, it does not matter which version of the ODBC Administrator you use.
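If you are unsure whether an application is 32-bit or 64-bit, you can often check from within the application itself. As one example, the following Python sketch reports the architecture of the running Python interpreter and names the matching ODBC Administrator entry from Control Panel. The function names are hypothetical, and the approach only applies when Python is the ODBC application.

```python
import struct


def process_bitness():
    """Return 32 or 64 depending on the pointer size of this Python process."""
    return struct.calcsize("P") * 8


def odbc_administrator_for(bits):
    """Name the ODBC Administrator that matches a 32-bit or 64-bit application."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    return f"ODBC Data Sources ({bits}-bit)"
```

Calling odbc_administrator_for(process_bitness()) returns the name of the Administrator in which to create System data sources for that interpreter.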


Attribute Fields

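This section lists the attributes that can be set for the Easysoft ODBC-Apache Spark Driver. For each attribute, a table shows the label used in the DSN dialog box (Windows), the entry used in the odbc.ini file (Unix) and the keyword used in a connect string.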

Attributes which are text fields are displayed as value.

Attributes which are logical fields can contain either 0 (to set to off) or 1 (to set to on) and are displayed as "0|1".

If an attribute can contain one of several specific values then each possible entry is displayed and separated by a pipe symbol.

For example, in the statement:

DIALECT=1|2|3

the value entered may be "1", "2" or "3".

DSN

The name of the User or System data source to be created, as used by the application when calling the SQLConnect or SQLDriverConnect functions.

Interface                   Value
DSN Dialog Box (Windows)    DSN
odbc.ini file (Unix)        [value]
Connect String              DSN=value

Description

Descriptive text that may be retrieved by certain applications to describe the data source.

Interface                   Value
DSN Dialog Box (Windows)    Description
odbc.ini file (Unix)        Description=value
Connect String              Not used

Server

The host name or IP address of the machine on which your Spark Thrift server is running.



Interface                   Value
DSN Dialog Box (Windows)    Server
odbc.ini file (Unix)        Server=value
Connect String              SERVER=value

User Name

If applicable to the chosen Authentication method, the user name (or LDAP DN) required to gain access to the Spark Thrift server.



Interface                   Value
DSN Dialog Box (Windows)    User Name
odbc.ini file (Unix)        User=value
Connect String              UID=value

Password

The password for User Name.



Interface                   Value
DSN Dialog Box (Windows)    Password
odbc.ini file (Unix)        Password=value
Connect String              PWD=value

Port

The port on which the Spark Thrift Server is listening. For non-HTTP Thrift server transports, the default port is 10000. For HTTP Thrift server transports, the default port is 10001.



Interface                   Value
DSN Dialog Box (Windows)    Port
odbc.ini file (Unix)        Port=port
Connect String              PORT=port

Varchar Len

The length that the Easysoft ODBC-Apache Spark Driver reports for varchar columns. If you are using the driver under Oracle, and get the error "illegal use of long data type", try setting this attribute to 8000. If you are using the driver under SQL Server and get the error "requested conversion is not supported" try setting this attribute to 2048.



Interface                   Value
DSN Dialog Box (Windows)    Varchar Len
odbc.ini file (Unix)        Col_Length=num
Connect String              COL_LENGTH=num

Encrypt

Whether to encrypt data passed over the communications channel between the Easysoft ODBC-Apache Spark Driver and the Spark Thrift server.



Interface                   Value
DSN Dialog Box (Windows)    Encrypt
odbc.ini file (Unix)        Encrypt=Yes | No
Connect String              ENCRYPT=Yes | No

TrustServerCert

Whether to bypass validation of the certificate used by the Spark Thrift server. This setting is only applicable if Encrypt is set to Yes. If TrustServerCert is set to No, you need to specify the path to the Thrift server certificate with the Server Cert attribute.



Interface                   Value
DSN Dialog Box (Windows)    TrustServerCert
odbc.ini file (Unix)        TrustServerCertificate=Yes | No
Connect String              TRUSTSERVERCERTIFICATE=Yes | No

Server Cert

The certificate used by the Spark Thrift server to encrypt connections to it.



Interface                   Value
DSN Dialog Box (Windows)    Server Cert
odbc.ini file (Unix)        CertificateFile=path
Connect String              CERTIFICATEFILE=path
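The relationship between the Encrypt, TrustServerCert and Server Cert attributes described above can be expressed as a small consistency check. The following Python sketch is an illustration only: the function name check_tls_settings is hypothetical, and treating an unset attribute as No is an assumption, because this guide does not state the driver's defaults.

```python
def check_tls_settings(attrs):
    """Return a list of problems found in the encryption-related attributes.

    attrs maps attribute names (any case) to values, for example the
    attribute=value pairs of one data source.
    """
    lowered = {key.lower(): str(value).strip() for key, value in attrs.items()}
    # Assumption: an attribute that is not set behaves as No.
    encrypt = lowered.get("encrypt", "no").lower() == "yes"
    trust = lowered.get("trustservercertificate", "no").lower() == "yes"
    cert_file = lowered.get("certificatefile", "")

    problems = []
    if encrypt and not trust and not cert_file:
        problems.append(
            "CertificateFile is required when Encrypt=Yes and TrustServerCertificate=No"
        )
    if not encrypt and "trustservercertificate" in lowered:
        problems.append("TrustServerCertificate has no effect unless Encrypt=Yes")
    return problems
```

For example, check_tls_settings({"Encrypt": "Yes", "TrustServerCertificate": "No"}) reports that a certificate path is missing, whereas adding a CertificateFile entry produces an empty list.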

Authentication

If your Spark Thrift server's hive.server2.authentication attribute is set to NONE, set Authentication to NONE.

If your Spark Thrift server's hive.server2.authentication attribute is set to NOSASL, set Authentication to NOSASL.

If your Spark Thrift server's hive.server2.authentication attribute is set to LDAP, set Authentication to LDAP.

If your Spark Thrift server's hive.server2.authentication attribute is set to Kerberos and the Thrift server uses a non-Windows KDC, set Authentication to Kerberos.

If your Spark Thrift server's hive.server2.authentication attribute is set to Kerberos and the Thrift server uses a Windows AD KDC, set Authentication to AD.

If your Spark Thrift server's hive.server2.transport.mode attribute is set to http, set Authentication to HTTP_BASIC.

If your Spark Thrift server's hive.server2.transport.mode attribute is set to http and uses an access token based authentication scheme (for example Databricks), set Authentication to HTTP_OAUTH.



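Interface                   Value
DSN Dialog Box (Windows)    Authentication
odbc.ini file (Unix)        Authentication=NONE | NOSASL | LDAP | Kerberos | AD | HTTP_BASIC | HTTP_OAUTH
Connect String              AUTHENTICATION=NONE | NOSASL | LDAP | Kerberos | AD | HTTP_BASIC | HTTP_OAUTH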
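The rules above for choosing an Authentication value can be summarised as a lookup. The following Python sketch is one possible reading of those rules, not a definitive implementation. The function name and its parameters are hypothetical, and giving the HTTP transport mode precedence over hive.server2.authentication is an assumption, because this guide does not say which rule applies when both match.

```python
def choose_authentication(hive_auth, transport_mode="binary",
                          windows_kdc=False, uses_access_token=False):
    """Suggest an Authentication value from the Thrift server's settings.

    hive_auth         value of hive.server2.authentication
    transport_mode    value of hive.server2.transport.mode
    windows_kdc       True if Kerberos uses a Windows AD KDC
    uses_access_token True for token-based schemes (for example, Databricks)
    """
    # Assumption: an HTTP transport decides the result on its own.
    if transport_mode.strip().lower() == "http":
        return "HTTP_OAUTH" if uses_access_token else "HTTP_BASIC"

    auth = hive_auth.strip().upper()
    if auth in ("NONE", "NOSASL", "LDAP"):
        return auth
    if auth == "KERBEROS":
        return "AD" if windows_kdc else "Kerberos"
    raise ValueError(f"unrecognised hive.server2.authentication value: {hive_auth!r}")
```

For example, choose_authentication("KERBEROS", windows_kdc=True) returns "AD", and choose_authentication("NONE", transport_mode="http") returns "HTTP_BASIC".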

Kerb Principle

The Kerberos principal for the Spark Thrift server. This setting is only relevant if Authentication is set to Kerberos or AD.



Interface                   Value
DSN Dialog Box (Windows)    Kerb Principle
odbc.ini file (Unix)        KrbPrinciple=value
Connect String              KRBPRINCIPLE=value

Http Uri

If you are using an HTTP-based Thrift server transport, set this attribute to the HTTP endpoint for the Spark Thrift server. For example, cliservice.



Interface                   Value
DSN Dialog Box (Windows)    Http Uri
odbc.ini file (Unix)        HttpUri=URI
Connect String              HTTPURI=URI

Access Token

If your Spark Thrift server uses an access token based authentication scheme (for example, Databricks), specify the token with this attribute.



Interface                   Value
DSN Dialog Box (Windows)    Access Token
odbc.ini file (Unix)        AccessToken=value
Connect String              ACCESSTOKEN=value

DSN-less Connections

In addition to using a data source, you can also connect to a database by using a DSN-less connection string of the form:

SQLDriverConnect(..."DRIVER={Easysoft Apache Spark ODBC Driver};

SERVER=mythriftserver;PORT=10000;AUTHENTICATION=NONE"...)

Use the DRIVER keyword to identify the Easysoft ODBC-Apache Spark Driver. Set it to Easysoft Apache Spark ODBC Driver on Windows or to Easysoft Apache Spark ODBC on Linux.
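To show how such a connection string is assembled, the following Python sketch joins the DRIVER keyword and a set of attribute=value pairs with semicolons. The function name build_connection_string is a hypothetical helper. Rejecting values that contain a semicolon or braces, rather than escaping them, is a simplification made for this example.

```python
def build_connection_string(driver, **attrs):
    """Join DRIVER and attribute=value pairs into a DSN-less connection string."""
    parts = ["DRIVER={" + driver + "}"]
    for key, value in attrs.items():
        text = str(value)
        # Simplification: refuse characters that would need escaping.
        if any(ch in text for ch in ";{}"):
            raise ValueError(f"unsupported character in value for {key}")
        parts.append(f"{key.upper()}={text}")
    return ";".join(parts)


# Reproduces the example connection string shown above (Windows driver name).
conn_str = build_connection_string(
    "Easysoft Apache Spark ODBC Driver",
    server="mythriftserver",
    port=10000,
    authentication="NONE",
)
```

The resulting string is what an application passes to SQLDriverConnect. From Python, it could be handed to an ODBC library such as pyodbc, which this guide does not cover and which is mentioned here only as an assumption about your environment.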