d137930cd297e313ff94926f749fe7995d1958a8
compute/data-munging.md
| ... | ... | @@ -1,21 +1,21 @@ |
| 1 | 1 | # Data Munging |
| 2 | 2 | |
| 3 | -Double-space a file: |
|
| 3 | +### Double-space a file |
|
| 4 | 4 | `sed -G` |
| 5 | 5 | |
| 6 | -Add a newline after every two lines: |
|
| 6 | +### Add a newline after every two lines |
|
| 7 | 7 | `sed '0~2 s/$/\n/'` |
| 8 | 8 | |
| 9 | -Replace newlines with arbitrary text: |
|
| 9 | +### Replace newlines with arbitrary text |
|
| 10 | 10 | `awk '{ print }' RS='\n' ORS='arbitrary_text'` |
| 11 | 11 | |
| 12 | -Work with human-readable e.g. disk ^2 sizes: |
|
| 12 | +### Work with human-readable power-of-two (IEC) sizes, e.g. disk sizes |
|
| 13 | 13 | `numfmt --from=iec` and `numfmt --to=iec` |
| 14 | 14 | |
| 15 | -Search Oracle lobs for strings: |
|
| 15 | +### Search Oracle lobs for strings |
|
| 16 | 16 | `select whatever from atable where (dbms_lob.instr(lob_column,utl_raw.cast_to_raw('text_to_look_for')) > 0)` |
| 17 | 17 | |
| 18 | -FPAT dirty quoted CSV awk: |
|
| 18 | +### Quick n dirty CSV awk |
|
| 19 | 19 | ``` |
| 20 | 20 | BEGIN { |
| 21 | 21 | FPAT = "([^,]+)|(\"[^\"]+\")" |
| ... | ... | @@ -23,4 +23,5 @@ BEGIN { |
| 23 | 23 | ``` |
| 24 | 24 | (doesn't handle newlines in quotes or escaped quotes though) |
| 25 | 25 | |
| 26 | -Paginated json coming back from a rest api? Use a Generator to return it. |
|
| 26 | +### Paginated REST results |
|
| 27 | +Paginated JSON coming back from a REST API? Use a Python generator to return it.
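
The generator idea in that last note could look roughly like the sketch below. Everything here is an assumption for illustration: the `items` and `next` field names, the cursor-style paging, and the `fetch_page` callable (a hypothetical stand-in for whatever HTTP call the real API needs). An in-memory fake API is used so the example runs without a network.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]

# Hypothetical in-memory "API": maps a cursor to a page of results.
# The "items" and "next" keys are assumed names, not a real API's schema.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"items": [1, 2], "next": "p2"},
    "p2": {"items": [3, 4], "next": "p3"},
    "p3": {"items": [5], "next": None},
}


def fake_fetch(cursor: Optional[str]) -> Page:
    """Stand-in for an HTTP request that returns one decoded JSON page."""
    return FAKE_PAGES[cursor]


def iter_items(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield items across all pages, fetching each page only when needed."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        # Hand items to the caller one at a time.
        yield from page.get("items", [])
        # Follow the cursor; a missing or empty cursor means the last page.
        cursor = page.get("next")
        if not cursor:
            break


if __name__ == "__main__":
    for item in iter_items(fake_fetch):
        print(item)
```

Because the generator is lazy, a caller that stops early (for example with `break` or `itertools.islice`) never triggers requests for the remaining pages. In real use `fetch_page` would wrap the actual HTTP request and JSON decoding.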