2022.06.29 hackerrank bigsorting python> export OUTPUT_PATH=output python> cat input | python3 bigsorting.py; cat output intro insertionsort1 insertionsort2 sortloopinvariant runningtime quicksort1 countingsort1 countingsort2 closestnumbers findthemedian insertionsortadvancedanalysis -- abandoned: couldn't find an o(nlogn) algorithm Tried to use a bitarray. Works, but does not solve the o(n2) issue. Abandoned again (without completing). kept input as input.isaa 2022.07.01 fraudulentactivitynotifications brute force: too slow using the counting sort: some test errors, and one time limit Suspicion: multiple values near the median: fixed Remains one time limit... test 5 200000 40001 Strange: same size as other one passing... odd median (simpler) larger window (4x) -- significant difference! python> time bash -c "cat input | python3 fraudulentactivitynotifications.py; cat output" 926 real 0m10.639s user 0m10.632s sys 0m0.009s python> time bash -c "cat input | python3 fraudulentactivitynotifications.py; cat output" 633 real 0m4.583s user 0m4.575s sys 0m0.011s Relatively many '100', so that as a median, it is quite stable OK: I had left one extra calculation, useful only for debugging It drove the time consumption and was proportional to d This is a task I failed to complete in go! I zip the code because of the 2 examples with 200000 items -- still ~0.5M 2022.07.04 lilyshomework 7 12 13 15 3 15/3 -> 7 12 13 14 3 15 14/3 -> 7 12 13 3 14 15 13/3 -> 7 12 3 13 14 15 12/3 -> 7 3 12 13 14 15 7/3 -> 3 7 12 13 14 15 bubbling inefficient, as: 7 12 13 15 3 7/15 -> 15 12 13 7 3 12/13 -> 15 13 12 7 3 but this requires to know the goal (i.e. one has to run a sort in advance) Other case: 7 12 15 3 7/12 -> 12 7 15 3 7/15 -> 12 15 7 3 12/15 -> 15 12 7 3 3 15/3 -> 7 12 3 15 12/3 -> 7 3 12 15 7/3 -> 3 7 12 15 3 7/15 -> 15 12 7 3 1 -- just noticing that both 12 and 3 already hold their final positions Note: n *distinct* integers The graph will now contain many non-intersecting cycles. Now a cycle with 2 nodes will only require 1 swap to reach the correct ordering, similarly, a cycle with 3 nodes will only require 2 swaps to do so. Strategy: 1. counting sort -> bitarray. 2. navigate identifying 2 sets of cycles, one for each sorting direction in each set: a) the values already in-place b) the values forming chains, as maps, gradually combining the chains into cycles 3. compute the number of swaps for every cycle, and sum 4. pick the minimum for the two sets 1 2 3 5 0111 0100 7*16 + 4 116 22.07.06 One only needs to count the length of the cycles. ModuleNotFoundError: No module named 'bitarray' OK, same thing with plain int... 5 time limits, 3 runtime errors, 4 successes For test case 5 (rt error), with 100000 items, the count sort loop doesn't complete on the laptop! So... bitarray was useful. I can try to use an int64 array instead. Much better, but still: now I should optimize sorting a strict cycle I.e. I have a solution of test case 5 with 1 cycle of length 10 The max is thus 10-1 = 9 -- no: neither the max nor the min But in this case, I find solutions in 8, but the expected result is 5... I used n-1 as nr of swap for a cycle of length n Using (n+1)//2, I get 4 wrong answers (1-3, 6) The problem is that it feels like the actual value depends on the order, not only on the length. Test case 1 has 274 items, for an expected output of 268 My answer is 137!? The fwd cycle is of length 273, the bwd one 274 137*2 = 274 The difference between the two cycles is that 275364045 is in the second one only (index 181 -- correct) -July 17------------------------------- I found that the cycles I had identified were in fact not correct. So, the n-1 count for a cycle may be correct, and it would yield 5 for 10out of order entries, if these would in fact constitute 5 2-cycles. Idem 8 is correct if they build up 2 5-cycles. Debugging with the 100000 item example (instrumenting the reverse check in the 2nd run: n, idx[n]): python> cat input | python3 lilyshomework.py; cat output fwd: 100000; rev: 10 0 python> cat input | python3 lilyshomework.py; cat output 1861185000 99997 1944242900 100000 1849599104 99993 1850865170 99995 2195634 2986 142461855 50134 236613600 63587 4401198 5140 121571388 46338 729655080 91374 fwd: 100000; rev: 10 0 and these values (e.g. 99997) are not themselves in idx (dictionary). -July 30------------------------------- Since last time, I have updated ubuntu, so that: venv> ll -d * drwxrwxr-x 6 marc marc 4096 Jun 15 18:00 p37 venv> python3 --version Python 3.8.10 I'll create a new venv (p37 was created for ~/jobs/zinkworks/assignment) venv> sudo apt update venv> sudo apt upgrade -y venv> sudo apt install software-properties-common -y venv> sudo add-apt-repository ppa:deadsnakes/ppa venv> sudo apt install python3.10 python3-venv python3.10-venv venv> python3.10 --version Python 3.10.5 venv> python3.10 -m venv p310 venv> . ./p310/bin/activate (p310) venv> python --version Python 3.10.5 (p310) venv> pip install --upgrade tensorflow (p310) venv> python -m pip install --upgrade pip (p310) venv> python Python 3.10.5 (main, Jun 11 2022, 16:53:24) [GCC 9.4.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import tensorflow as tf 2022-07-30 09:22:41.401892: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0';y 2022-07-30 09:22:41.401914: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on. >>> quit()