Database

This post is a continuation of SysAdmin's “Homework” – part 1: disk baselines, with a more interesting task: optimizing the loading of large CSV files into a database of your choice, as fast as possible. The starting point is:

copy extract from STDIN WITH delimiter E'\t' NULL AS ";

tl;dr? The summary is at the end of the article 😉

Intro notes

- Disk benchmarking for ETL is covered in the file for task 1.
- The following sections are in chronological order of testing.
- In the perf tests, each reported figure is the average of the 2 most consistent results out of the 4-6 runs performed.
- The average values from awk and plotly.js differ because rounding is performed at different stages of the computation.
- krps is k rows/sec: 1 krps = 1000 rows inserted per second.

Environment setup

The initial filesystem for /home is ext4 with journaling. Below is the setup log.