Only in DbVisualizer Pro
This feature is only available in the DbVisualizer Pro edition.
Instead of using the GUI to import data you can use client-side commands to import data, i.e @import. This enables you to use the DbVisualizer command-line interface to automate your imports and utilise other client-side commands such as @export, @mail, among others. Import data using the @import command supports the following formats:
xlsx or the legacy xls format
The same XML formats that can be exported with DbVisualizer
The same XML formats that can be exported with DbVisualizer
Importing CSV files supports a lot of configurations such as multi-symbol column separator, multi-line values, etc.
Importing fixed width text files
These are client-side commands for @import are:
|The command starts an import session|
|Set parameters for the import|
|Parse and convert the source data to an intermediary format, and analyze the data.|
|Identify the target table and what target columns should be used|
|Runs the export|
The import process is explained by the following figure.
The @import parse step (1) lays the foundation for the database import as it based on the source file format (csv, json, xlsx, etc), parses the data to an intermediary and internal file. This file is then analyzed in order to detect widths, data types and their sizes based on the data, row and cell references back to the source file, and a lot more.
The intermediary format stores the input data as Records associated with information from where the data originates.
Once the data has been parsed some basic tests are performed to see if the properties of the data is compatible with the target table properties. E.g. if the size of the input data will fit the targeted table columns. This done when when the @import target command is run (2).
Final step of an import is to execute (3-4) the actual operations towards the database to import the data. I.e. execute the INSERT statements.
Any failures in the import will include specification of where in the source file the invalid data originates.
Following is a complete example where a simple CSV file is imported to a Target table Expenses.
Importing data - Example
The example shows how data about fruits included in a CSV file fruits.csv is imported into a database table fruits.
The example shows a minimal example where the columns of the CSV file is mapped directly to the table columns.
The SQL needed to import the file
Running the import script in DbVisualizer
As seen in the screen-shot above 3 records were inserted and and 1 was skipped (the header).
Read further on in the guide for how you can tailor your import using the different parameters of the import commands.
The @import commands
The @import on command initialises the import session. When a client-side command import session is started an output folder is created where DbVisualizer generates data representing intermediate results of the import. The output folder is created under a root folder (importresults) in the DbVisualizer preferences directory. The name of the folder is generated to be unique.
These folders are automatically cleaned by DbVisualizer on regular basis.
Note: Normally there is no need to specify any parameters to the @import on command. Default values should be sufficient for most uses.
The root folder for parsing results and other temporary files produced by the import session.
Note: Any results produced in this directory will automatically be removed on regular basis.
|ImportResultDir||Automatically generated as a sub folder to |
The path to put the DbVisualizer import results. If the path is an absolute path the results are stored in this folder. If it's not an absolute path it is assumed that it's path relative to the ImportResultRoot
The folder is created when import is started.
If the folder already exists, the import fails unless the Clean parameter is also used.
If none of the ImportResultRoot/ImportResultDir is specified DbVisualizer will create a new uniquely named directory for every import session.
If true, the ImportResultDir is cleaned before import. If false import fails if directory exists.
No need of specifying this parameter if ImportResultDir is not specified.
|ContinueImportFile||Specified if you are continuing an import. See chapter about Continuing an Import.|
Import results are stored in the folder /tmp/myimport. Any existing folder /tmp/myimport will be deleted/cleaned.
Import results are stored in
PREFSDIR/importresults/newimport. If this directory exists the import will fail.
The @import set command takes a parameter name followed by an equal sign and a value, e.g.
parameter="value". You can use the following parameters, where ImportSource is the only required parameter. All names are case-insensitive. Note that you may use multiple @import set commands but the first one must include the parameter ImportSource setting the input data to import.
|BooleanFalseFormats||false, no, 0, off|
Comma separated string of values that should be interpreted as boolean false.
|BooleanTrueFormats||true, yes, 1, on||Comma separated string of values that should be interpreted as boolean true|
Can be used to override the auto detection by specifying a column delimiter.
The most common column delimiters are auto detected.
|Valid values are: String or Auto Detect. If only |
Having data quoted is needed if the data itself contains the delimiter use to separate columns, special characters such as tabs, multiline separators, etc.
Any character is valid to use. Some additional aliases can be used to make the values more clear.
Example 1: For the following
Example 2: For the following
Example 3: For the following
If you have quoted data in your import source and do not specify
|DateFormat||yyyy-MM-dd||See valid formats in Changing the Data Display Format document.|
|DecimalSeparator||.||The decimal separator to use when parsing decimal numbers|
|Encoding||As set in Tool Properties||The file encoding to use when parsing the file to import|
|ErrorIncludeStackTrace||false||true/false If true the Java stack trace of any exception will be included in error messages.|
Defines what to do for a formula cell where the cached value indicates an error. Possible values:
An id specifying the sheet id of a workbook when importing from xls/xlsx files. First sheet has id = 0
|ExcelSheetName||A name specifying the sheet name of a workbook when importing from xls/xlsx files.|
|FailOnConvertFailure||false||If true, the Import will fail if data conversion fails.|
|FailOnNoColumnsFoundFailure||false||If true, the Import will fail if we found no columns during continue import|
|FailOnParseFailure||false||If true, the Import will fail if we got a failure during parsing of the source data.|
The row index of the row where the Header starts. If set to a number x the StartRowOfData parameter is automatically set to x +1.
The default value 0 indicates that the source data has no header information.
A path to the file to import. Must be included in the first @import set command. The path is an absolute path or a relative path to the script location.
If a @cd command has been run before the @import set command a relative path is relative to the @cd directory
|MaxRows||-1||The Maximum row to parse/import|
|ShowNullAs||The value that should be considered as NULL. E.g. (null)|
If set to false the header is also imported.
The row index of the row where data starts. See also HeaderStartRow.
|ThousandSeparator||,||The thousand separator to use|
|TxtColumns||Used when fixed columns text files are imported. Example 0, 4. For detailed syntax of the TxtTrim parameter please see the example Importing fixed column width input data|
|TxtTrim||true||If true, the column data retrieved when importing fixed column text files is trimmed.|
|TimeFormat||HH:mm:ss||See valid formats in Changing the Data Display Format document.|
|TimeStampFormat||yyyy-MM-dd HH:mm:ss.SSSSS||See valid formats in Changing the Data Display Format document.|
We are starting to import data from row 5 of the source data file and skipping rows starting with "//".
The input data has header information at row 1. We are starting to import data from row 2. As
StartRowOfData is not explicitly set is automatically set to 2.
This command does all the parsing and analysing of input data.
The source data file is parsed. As a result result are stored in the location pinpointed by ImportResultDir.
Example output from the
@input parse command:
The information shows the number of parsed records along with the number of columns found. If the parsed file was a CSV file the used delimiter is printed.
For each column the following information is printed:
- INDEX: The index of the column
- NAME: If
HeaderStartRowwas specified and header information was extracted the extracted column name is printed.
- TYPE: The type of data found. The size declaration (E.g. 16) represents the longest string found.
- NULLABLE: If the column is nullable or not.
- FROM_ROW: From which row in the source file the data type (e.g. String) was determined. This number serves as a hint to investigate source data when an unexpected type is analyzed.
E.g. The column name BIRTHDAY in the source data indicates that this data should be a date. By investigating the source data at row 4 you may find the reason why the column was analyzed as String column.
This command is responsible of all preparation of the target table prior to import. This includes dropping, truncating, deleting from and creating the table.
When this is executed a check is done if the input data will actually fit the table. This is done by comparing of the analyze result with the specified target and column mapping. Depending on the parameters the check is performed at different occasions. Parameters for this command are:
|Catalog||The target table Catalog|
Specifies if data should be cleared before import. Values
|ClearTableMethod||Truncate or Delete.|
Specifies mapping of source columns to target columns. The default is to import the source columns to the target table column by index. First column in source is import in first column in table. To specify another order or to ignore certain columns use ColumnMapping parameter.
Source column can be identified by index starting with 0 or by its name.
The mapping also supports overriding of the type information of the input data column.
The syntax in this case, for a single column mapping is:
|ColumnMappingFile||A reference to a file containing the column mapping.|
|CreateTableSQL||The SQL needed to create the table to import too. This parameter or the |
|CreateTableSQLFile||A file reference to a file containing the SQL to create the table.|
|DropTableSQL||The SQL for dropping the table|
|FailOnDropFailure||false||If true, the Import will fail if the DROP table statement fails.|
|Schema||The target table schema to import to|
|A comma separated list of target column names for which validation shall not be done. Example: |
|A comma separated list of target column numbers for which validation shall not be done. Example: |
|A comma separated list of JDBC type names for which validation shall not be done. Example: |
|Table||The target table to import to|
true or false
If true object identifiers will be delimited.
Target table is the "MyTable" table. The table is dropped and recreated before import. We are mapping the first column of the input data to the target column "ID" and the third column to the target column "NAME".
As the table is dropped we need to supply the DDL/SQL for creation of the table.
The table is cleared before the import. The clear method is determined by DbVisualizer. For Databases supporting this Truncate is used.
Run the actual import.
true or false
Using batch import will significantly improve the import speed. Note though that batch import may not be supported by all databases or JDBC drivers. In error situations it is also a good idea to switch off batch import.
For every 100 (or specified) number of rows being inserted, DbVisualizer will run a commit.
|FailOnInsertFailure||false||If true, the Import will fail if an INSERT statement towards the database fails|
Run the import. Don't import using batch import
Selecting data to import and mapping columns
CSV File delimited by exclamation mark "!".
Note that the first column of the CSV file is the brand name (Volvo) of the car. The table we are importing to have the columns in opposite order model, brand. I.e. we need to Map the columns.
CsvColumnDelimiter="!" specifying that the data is delimited by the character '!'.
SkipRowsStartingWith="BMW" We are not importing BMW cars
ColumnMapping="0=brand,1=model" Column 0 of the CSV file is mapped to the brand column of the table. Column 1 is mapped to the model column.
The table carmodel content after import:
Overriding analysed type information
When an input file is parsed DbVisualizer analyses the data to determine data types of the input data. The algorithm for this is quite coarse. DbVisualizer does offer a way to override the analysed data type.
Note that the birthday of "Lotta" is "NoData" which is of course not a valid date. When DbVisualizer parses/analyses the data, it will come to the conclusion that the BIRTHDAY column is a String.
The result of @import parse will contain a table describing information about the data that was parsed.
As mentioned earlier you can see that column BIRTHDAY has been interpreted as a String. This was found examining row 4 (FROM ROW column is 4).
In connection with inserting this column in the database DbVisualizer would insert/set this as a string. This would result in total import failure and no rows would be inserted in the database.
This may be addressed by overriding the analysed type for BIRTHDAY (String) and set the type to date.
- DateFormat="EEE, d MMM yyyy"
Defining the format to be able to interpret the dates in the CSV file.
Note the 1(date)=birthday where we are mapping the column with index 1 to the target column birthday. The (date) part specifies that column 1 should be interpreted as a date.
The result of the import using the script above is that 3 rows are imported (August, Sven and Bert). The row representing Lotta is reported as a failure as Indicated below.
Importing fixed column width input data (TxtColumns parameter)
The SQL Script
The parameter TxtColumns parameter specifies the column character positions. In this case first column starts at character position 0 and the second column starts at character position 6.
The resulting imported table is
Note how "LEMON" is imported without proceeding blanks. This is because column values are trimmed (TxtTrim parameter default is true).
The TxtColumns parameter
The TxtColumns parameter supports a number of syntaxes as explained in the examples below.
An example when parsing a row "AAA BBB CCC"
Yields extracted columns
|0-2, 4-5||"AAA" "BB"|
|1-3, 8-9||"AA" "C"|
Omitting the end index (as in the SQL script above)
Yields extracted columns
|0, 4, 8|
( same as 0-3,4-7, 8-end of line)
|"AAA" "BBB" "CCC"|
(same as 1-7,8-9)
|"AA BBB" "C"|
Using the "+" sign
Yields extracted columns
0+3, 4+3, 8
|"AAA" "BBB" "CCC"|
|"AAA BBB" "CC"|
Importing Excel data
Note how row 2 column B has the value #DIV/0!. This value represents a case where a cell is a calculated using formula where the calculation is producing an error.
SQL Script to import
Specifying that the sheet named mydata is is the sheet to import.
Specifies that if we find a formula that produced an error we should skip the complete row.
Note that row 2 is not imported as we instructed by ExcelCellPolicyError="SKIP_ROW";
Continuing an export that has failed
If any of the input data cannot be imported DbVisualizer will keep track of this. This is done by storing the failed data in a specific errorRecords.drec file in the directory where the import process stores its intermediate results (See ImportResultRoot parameter).
Following is an example were the export fails. It also shows how to import the failed data again. Specifically, the first import fails as Clementine is a name that is too long to fit in the target table column.
First Import SQL Script
Note: the table definition for the fruit table defines the column name to be VARCHAR(6). The input data "Clementine" will not fit there.
ImportResultDir="/tmp/importContinueFirstImport" We get our results in this directory. Makes it easy to refer in the second import script.
SkipValidateJdbcTypes="VARCHAR" Tells DbVisualizer not to validate VARCHAR columns. This is done for the purpose of this example. If not specified, the import would stop before any data has been imported.
When running this import the Database used in the example (MySQL) will fail when the import tries to insert the data row 3 as Clementine will not fit the column. The Failure printed in the DbVisualizer Log for this looks something like:
Note that the source data is pinpointed as originating at row: 3.
The Table fruits content after import.
Second import SQL Script
The ALTER statement is dealing with the root cause why the import failed. The name column was too small.
Note that when continuing an import, the commands @import set and @import target is not specified. The settings and target from the old import is used.
ImportResultDir="/tmp/importContinueSecond": Specifying a separate directory for the results
UseImportFile="/tmp/importContinueFirstImport/Results/errorRecords.drec" Pinpointing the file containing the data that contains the data that could not be imported.
The Table fruits content after second import.
Testing an import - Dry Run
The client-side import offers a way to run the import script to perform all client side data validation without changing anything in the database. This is done using the @set dryrun command.
Note that when running the import, without dry run, the import may fail nevertheless due to checks on the database side. E.g. primary- or unique key constraint checks.
When running the script, no actual clearing of the table will be done as the parameter CleanData indicates. Nor does the @import execute lead to any rows being inserted in the database.
Since there is a @set dryrun command prior to the commands no changes to the database table will be performed.