Iqbal´s DLQ main Help

File Diff with SQLite Shell

While working on OkHttp3 Scripting

I needed to do a diff from the original input file, to successful cases

and rerun again for failed queries.

this is the use case we are handling here:

original input file

input.txt

xxxx;B7;C7 xxxx;B12;C12 xxxx;B29;C29 xxxx;B15;C15 xxxx;B41;C41 xxxx;B58;C58 xxxx;B8;C8

We get some output in the form of

success.txt

xxxx;B7;C7 xxxx;B12;C12 xxxx;B29;C29 xxxx;B15;C15

failure.txt

xxxx;B41;C41 xxxx;B58;C58 xxxx;B8;C8

what we need is to only process (input - success) input.txt

xxxx;B41;C41 xxxx;B58;C58 xxxx;B8;C8

this an A-B set operation that SQL handles well:

sql-join.png

so let´s make it into one :D

SQLite Shell

SQLite Shell is perfect for this operation, I´ve tested it at least on a 50K+ records file and it returns almost immediately :

get SQLite binaries work on a single directory if you wish

sqlite.png

SQL file to run:

diff.sql

create table file1(line text); create index if1 on file1(line ASC); create table file2(line text); create index if2 on file2(line ASC); -- comment: if you have | in your files then specify “ .separator ××any_improbable_string×× ” .import 'success.txt' file1 .import 'input.txt' file2 .output input.txt select * from file2 where line not in (select line from file1); .q

Check the paths to your files and run .read diff.sql in the shell:

➜ sqlite git:(master) ✗ sqlite3 ➜ sqlite git:(master) ✗ sqlite3 SQLite version 3.42.0 2023-05-16 12:36:15 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> .read diff.sql sqlite>

input.txt is now overwritten with only the remaining elements to process:

input.txt

xxxx;B41;C41 xxxx;B58;C58 xxxx;B8;C8

You can rerun your script against failures,

Some can be due to network issues, timeouts that can happen in OkHttp3 Scripting, etc.

those are worth the retry :)

Last modified: 12 March 2024