Thursday, August 22, 2019

Convex hull

The previously posted flownet generation program has a seemingly insurmountable problem. After the equipotential points for a given head have been extracted from between the solved finite-difference grid points (here, using linear interpolation) comes the hard part: joining them properly. I outlined my approach in the first post about the program. There, I pointed out that when equipotential points for a given head are discovered at the extreme left and right of the grid, a jumpy line springs into existence because the limited space the grid represents clips the isobars. At the time, however, I hadn't realized there was an even more sinister problem with the 'connect the dots' logic - one that doesn't, at this point, seem solvable, because of the weird ways these equipotential lines can naturally bend. Take, for example, the following output with
contourInterval=5
numDys=13
numDxs=16
DxsTillDam=2
DamWidthInDxs=4
reservoirHead=100
downstreamBoundaryHead=50
The problem is visible in the 15m isobar. Starting from underneath the dam, the point near the 16.29m grid node should have been joined first, rather than the point between the 11.06m and 19.26m grid nodes. That is easy for us to say from a simple visual inspection. But the 'order in descending Y coordinate and then by shortest consecutive distance' logic falls flat here, since the program is doing exactly that. The point actually being joined is simply closer to the equipotential point on the dam than the 'right' point, which happens to lie a little too far to the left.
I thought of trying all orderings of the points to minimize total length but quickly concluded that it would be computationally unviable - the number of equipotential points for a head regularly crossed 20 when I was testing the program, and that's 20!, or about 2.4 × 10^18, orderings to try for drawing each isobar. What's more, I wasn't sure it would even produce the desired result. So, I scrapped the idea.

Then, I researched ways to do this online. Something called the 'convex hull' came up. Basically, it is a way of drawing a polygon around a bunch of points, passing through the right subset of those points, so that no point is left outside the polygon. The trick is to test each possible line segment (defined by two unique points) for convexity, i.e. whether all other points lie on the same side of the segment; if they do, the segment being tested is part of the polygon - the 'convex hull'. The idea behind this side test - whether a point lies on one side of a line or the other - comes from checking the point's y-coordinate against the line's y-coordinate at the point's x-coordinate. If all the points lie above the line (the idea of a point being 'above' a line extends from the obvious horizontal case to any line that isn't completely vertical), their y-coordinates are greater than the line segment's y-coordinates at the points' respective x-coordinates. Anyway, this thread's first answer is what I implemented - but not into the Fortran program itself. I was skeptical, you see. I couldn't quite picture how this thing worked or what it would look like in action. So, I first made a small JavaScript program using p5.js so that I could have something concrete to look at and potentially tweak.
Here is the code for it.
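My p5.js demo is what's linked above; for reference, here is an equivalent Python sketch of the brute-force hull test described in the previous paragraph. One deliberate difference: instead of comparing y-coordinates (which breaks down for vertical segments), it uses the sign of a cross product as the side test:

from itertools import combinations

def convex_hull_edges(points):
    # points: a list of (x, y) tuples
    def side(p, q, r):
        # > 0 if r lies left of the line p->q, < 0 if right, 0 if collinear
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

    edges = []
    for p, q in combinations(points, 2):
        signs = [side(p, q, r) for r in points if r != p and r != q]
        # the segment p-q is a hull edge if all other points lie on one side
        if all(s >= 0 for s in signs) or all(s <= 0 for s in signs):
            edges.append((p, q))
    return edges

It's O(n^3), but for the couple dozen points per isobar here, that's immaterial.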
Then, it became clear that no amount of tweaking the convex hull algorithm could manage to connect the equipotential points 'properly'. There would be too many exceptions - non-convex points that would need to be connected, but would not be. Leaving such points out would produce smooth and nice equipotential lines but that would also be misleading and far from accurate.
So that's where I left the flownet program, without further progress.
But at least I got to learn something new in the process.

Monday, August 12, 2019

Flownet update

This is a revised version of the last flownet generation program.

Changes:

  • Head at the dam heel is now equal to the impounded water depth, as it should be (it was assumed 0 before)
  • The dam was previously fixed at the size of a single grid box. Now the width of the dam can be specified
  • The dam's heel and toe are Dirichlet boundaries, i.e. known heads, but any nodes between them are treated as unknowns
  • Better code documentation during coefficient-matrix construction.
The code file is 'Source1V2.f90'
My sketches for this revision.
An associated Excel spreadsheet for control.

Some more screenshots:
Flownet of the original problem with numDxs=4, numDys=3, DxsTillDam=2, DamWidthInDxs=2, reservoirHead=100, downstreamBoundaryHead=50 and contourInterval=10
Flownet of a problem with numDxs=5, numDys=5, DxsTillDam=2, DamWidthInDxs=3, reservoirHead=100, downstreamBoundaryHead=50 and contourInterval=10

Flownet of a problem with numDxs=15, numDys=15, DxsTillDam=5, DamWidthInDxs=3, reservoirHead=100, downstreamBoundaryHead=50 and contourInterval=5
The weird aggregation of isobars at the right (downstream) boundary between 0 and 50m is because of the type of boundary condition given. A constant-head (50m) Dirichlet boundary along the vertical direction is unrealistic. An impermeable vertical plane (a Neumann boundary) would have allowed a smooth and probably more realistic variation of head downstream.

Sunday, August 11, 2019

Flownet Generation in Fortran

This was a 'challenge' assignment (at least that's how I perceived it) given to us by our Fortran Simulation Lab professor. We were given a dam impounding 100m-deep water. We had to solve Laplace's equation in 2D using the 'implicit scheme' - basically solving a system of linear equations consisting of Laplace's finite-difference equation for each unknown node. For the finite-differencing domain, 4 grids along the horizontal and 3 grids along the vertical (each grid being a square of unspecified dimensions) were provided. We were also given Dirichlet boundary conditions on the nodes at the downstream end. When I first set out to solve the problem manually, as a control for when I actually developed the code, I thought that the heel of the dam should have a head equal to the impounded water's depth, as all points upstream on the ground surface do - and I still think so. But that is how the problem was posed, and so I ran with it.
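For reference, on a uniform square grid the finite-difference form of Laplace's equation makes each unknown head the average of its four neighbors:

h(i,j) = [ h(i+1,j) + h(i-1,j) + h(i,j+1) + h(i,j-1) ] / 4

One such equation per unknown node is what builds the linear system.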
The problem
I wanted to make the code 'general' in that it could work for other scenarios as well - not just this specific combination of parameters. So, I decided to allow for 6 inputs: contourInterval, numDys, numDxs, DxsTillDam, reservoirHead, downstreamBoundaryHead. reservoirHead and contourInterval should be pretty obvious. numDxs and numDys are just that - the total grids/boxes to work with along the horizontal and vertical axes. DxsTillDam is the number of grid boxes to the left of the dam's heel. downstreamBoundaryHead is the head at the downstream boundary. Based on these parameters, I had to think hard about what the coefficient matrix for the system of linear equations would look like. Since the matrix has a lot of structure in it, it was not that big of a problem. I just looked at the matrix I had prepared while working out the problem by pen and paper and coded in the logic to populate a 2D array with the exact same rule that the elements of the original problem's matrix followed. Another 2D array functioning as a single-column matrix, i.e. a vector representing the Dirichlet boundary conditions, was built side by side while populating the coefficient matrix.
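To make the construction concrete, here's a minimal NumPy sketch of the same idea - not the Fortran code itself, and it assumes the simplest case where every boundary node is a known (Dirichlet) head supplied by a hypothetical callback boundary_head(i, j):

import numpy as np

def assemble_laplace_system(nx, ny, boundary_head):
    # unknowns are the interior nodes of an (nx+1) x (ny+1) node grid,
    # numbered row by row
    def idx(i, j):
        return (j - 1) * (nx - 1) + (i - 1)

    n = (nx - 1) * (ny - 1)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for j in range(1, ny):
        for i in range(1, nx):
            row = idx(i, j)
            A[row, row] = -4.0  # center of the 5-point stencil
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 < ni < nx and 0 < nj < ny:
                    A[row, idx(ni, nj)] = 1.0  # unknown neighbor
                else:
                    b[row] -= boundary_head(ni, nj)  # known head moves to the RHS
    return A, b

Solving is then just h = np.linalg.solve(A, b).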
Matrix for reference and Manually solved grid for reference
Mazumder and Wen Shen were indispensable for understanding the matrix formulation.
The next step was to find the points on the grid lines that had the same head, for each head of interest. After some thinking, I settled on sweeping from the left boundary to the right boundary (minus one, to be precise), one column at a time, looking for a specific head immediately to the right and immediately above each node while moving up the column to the top (minus one, to be precise). Since the head being searched for rarely exists on the grid nodes themselves, I interpolated between the current node and the east node, and between the current node and the north node, to get the exact coordinates of the head - provided it existed between the nodes. I stored the interpolated coordinates in two matrices: one for the X coordinates, another for Y. In each matrix, each column corresponded to a head, viz. a column for 10m, another for 30m, etc.
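The interpolation itself is one line. A sketch, with coordinates taken along whichever direction is being scanned:

def crossing(c0, c1, h0, h1, target):
    # position of head `target` between two adjacent nodes with heads h0, h1,
    # or None if it doesn't lie between them
    if h0 == h1 or (h0 - target) * (h1 - target) > 0:
        return None
    t = (target - h0) / (h1 - h0)
    return c0 + t * (c1 - c0)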
Now that I had the equipotential coordinates/points for each head in question, all that was left was to join them. I first tried joining them in the order of their discovery, i.e. according to their positions in the coordinate matrices. That didn't work very well, as anyone might've guessed. Order matters when joining a bunch of equipotential points: you don't want zigzagging lines going back and forth on what should be a smooth isobaric curve. I then tried joining the points from the top down - basically sorting them in descending order of their Y coordinate. That didn't work very well either. Since the isobars tend to be U-shaped, this method often led to left-right-left zigzags. I then thought of joining the points from left to right. This worked for the majority of the isobars, but it almost always messed up one isobar - right in the middle, where the curves shift from left-facing to right-facing. The isobars to the left and to the right are 'proper' functions, i.e. a unique Y for any given X. However, in the transition zone between the two kinds - which happens to be right at the center of the base of the dam (not here though, for the reason described towards the top of this article), or more precisely at ground level right between the reservoirHead and 0 nodes - the isobaric curve is somewhat vertical and hence behaves 'ill' as a function, i.e. two or more Y's at some X. It is precisely because such isobars exist that the left-to-right joining method fails towards the center of the grid. I finally adopted a hybrid technique: joining from top to bottom, but taking the shortest distance to the next point into consideration. This involved sorting the equipotential points for a head in descending order of Y coordinate and then sorting by shortest distance between consecutive points. This gave better results, but for some unknown reason, which may or may not be related to this technique, the bottom-most isobars started making jumps between left and right. I haven't researched this issue much, but it may just be how flownets are: loop-like. When a loop gets clipped by the domain boundary, there are bound to be equipotential points to the left and to the right for a head whose bottom points were clipped, and in that case this 'jumpy' behavior is exactly what would be expected. Anyway, I just wrote a couple of anti-jump lines to prevent joining points farther apart than a specific X and Y distance.
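A sketch of that hybrid rule in Python, with max_dx/max_dy playing the role of the anti-jump thresholds:

import math

def join_points(points, max_dx, max_dy):
    # start from the topmost point, then greedily chain each point
    # to its nearest remaining neighbor
    pts = sorted(points, key=lambda p: -p[1])
    ordered = [pts.pop(0)]
    while pts:
        last = ordered[-1]
        pts.sort(key=lambda p: math.hypot(p[0] - last[0], p[1] - last[1]))
        ordered.append(pts.pop(0))
    # anti-jump: only draw segments between points that are close enough
    return [(a, b) for a, b in zip(ordered, ordered[1:])
            if abs(a[0] - b[0]) <= max_dx and abs(a[1] - b[1]) <= max_dy]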
My scribbles: 1 2 3
The viewport width and starting position (top-left coordinates) are by default set to 600, 200 and 8. Sometimes a larger viewport width may be desirable. The user may freely change these without fear of messing up the grid, irrespective of the problem-centric parameters listed in the second paragraph.
Code
Flownet with reservoirHead=100, numDxs=4, numDys=3, contourInterval=10, downstreamBoundaryHead=50 and DxsTillDam=2

Flownet with reservoirHead=100, numDxs=20, numDys=20, contourInterval=10, downstreamBoundaryHead=50 and DxsTillDam=4

Saturday, June 8, 2019

Dam-break simulation

This is a JavaScript-based dam-break problem simulator - something I am studying as part of the Advanced Hydraulics curriculum. It's basically just one equation relating x, t and depth y.
I wanted to make something tangible, so I did this - just for fun. It makes use of Chart.js and a Web Worker. The Web Worker script needs to be kept in a separate .js file. Chrome doesn't allow HTML files to access the local filesystem, so open the file with Firefox, or start Chrome with the --allow-file-access-from-files flag and then open the HTML file. Code
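The post doesn't spell the equation out, but the usual single-equation result for an instantaneous dam break on a dry, frictionless horizontal bed is the classic Ritter solution; here's a sketch assuming that's the one:

import math

G = 9.81  # gravitational acceleration, m/s^2

def ritter_depth(x, t, y0):
    # depth y(x, t); x measured from the dam site, y0 = initial reservoir depth
    c0 = math.sqrt(G * y0)   # small-wave celerity in the reservoir
    if x <= -c0 * t:
        return y0            # undisturbed reservoir
    if x >= 2 * c0 * t:
        return 0.0           # ahead of the wave front: still dry
    return (2 * c0 - x / t) ** 2 / (9 * G)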
EDIT:
This is after I've realized what GitHub Pages is.
Live demo here
Corresponding repo here
Future updates, if any, will be made to the repo, not the Gist

Thursday, May 30, 2019

Standard-form Linear Programming Problem solver with Simplex method

This is just a JavaScript implementation of the Simplex method for solving standard-form Linear Programming problems. Part of Operations Research.
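'Standard form' here means, in one common statement (conventions differ between texts - some state it as a minimization with equality constraints and slack variables):

maximize c^T x   subject to   Ax <= b,  x >= 0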
EDIT:
This is after I've realized what GitHub Pages is.
Live demo here
Corresponding repo here
Future updates, if any, will be made to the repo, not the Gist

Mein and Larson Infiltration model

I actually did this about a week or two back. I was on the fence about whether I should put it here on my blog. I finally decided to.
This is not a full-fledged utility program - just something I did for myself. Part of my M.Sc. Water Resource Engineering curriculum in Advanced Hydrology is the Mein and Larson model of infiltration. This is just a visualization of that model's output using chart.js.
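For context, the core of the model in the usual Green-Ampt-based notation (treat the symbols as assumptions on my part): for a constant rainfall intensity i greater than the saturated hydraulic conductivity K, everything infiltrates until ponding, which begins once cumulative infiltration reaches

F_p = ψ·Δθ / (i/K − 1),   at time   t_p = F_p / i

where ψ is the wetting-front suction head and Δθ the soil moisture deficit; after ponding, infiltration follows the Green-Ampt curve.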
Professors require us to use Fortran but come on, in 2019, when it comes to utility, functionality, versatility, whatever you call it, nothing beats JavaScript. There's almost always some library for what you do that makes you want to use JavaScript. Here is the code.
EDIT:
This is after I've realized what GitHub Pages is.
Live demo here
Corresponding repo here
Future updates, if any, will be made to the repo, not the Gist

Geetify

I had thought this blog already had this post. Apparently, it doesn't. So, here it is.
Straight from my GitHub.
Geetify is an Android app that allows you to search for YouTube videos and download them as small audio files. I made it because there was no equivalent to Spotify or Gaana or Voot etc for Nepali songs. [Edit: Turns out, Spotify does have Nepali songs]
The downloaded audio files are a special kind called DASH audio. DASH audio seems to be a 'streamable' audio type; YouTube using it makes sense. There's an .mp4 and a .webm format audio for (mostly) all videos on YouTube. Geetify prefers the .webm version since it seems better behaved when seeking to different times during playback and is generally smaller. Currently, there is no way to change that preference. It also automatically chooses the lowest-bitrate audio because of the smaller size. While the quality loss may not be noticeable for most songs, sometimes it is very conspicuous - one example is Castle of Glass by Linkin Park. There is no option to choose the desired bitrate either at this moment.
The heart of this project lies in the extraction of the audio links from YouTube videos and rendering those often 'signature-protected' links usable. This is why I looked around the internet for something like this and I found youtube-dl. I first made a Python program replicating the core functionality of youtube-dl and then simply ported it to Java for Android. Details and the inner workings of this process of extraction have been posted here as a separate post - Youtube audio downloader. Throughout the project, the aforementioned youtube-dl has been the go-to reference. Without it, this project would've been impossible.
Oh, almost forgot to mention - I originally created this back towards the beginning of 2018, if I remember correctly. I used a third-party website to do the conversion for me, so the code was very simple back then compared to now. Then, all of a sudden, the third-party website stopped being easygoing on the likes of me, who crafted HTTP requests similar to the ones produced by real users on real web browsers to get the songs programmatically. That was the whole reason for what was basically a replacement of the program's engine, so to speak.
There are some bugs here and there but overall, the program is functional.
Here's the direct link to the apk.

Friday, April 12, 2019

Youtube audio downloader

This is a Python 3.7 script to download DASH audio files from YouTube.
I was trying to revamp my app, Geetify, which used a third-party website to extract OGG audio from YouTube. I wondered if it would be possible to do it on my own. Sure enough, a quick Google search led me here through this StackOverflow post. I honestly didn't think the task could be so straightforward. From the SO post and youtube-dl's source on GitHub, I gathered the following:
  • YouTube stores video and audio files separately and has a bunch of different formats (mp4, webm, audio, video, 1080p, 720p, 480p and so on) for each video
  • The links for these resources are right there in the page source of the corresponding video, specifically, this information is stored in a JSON config object seemingly to be used by a 'ytplayer' (YouTube's video player?)
    Source of a YouTube video page with ytplayer.config object
    ytplayer.config object from the developer console
  • The "args" key of this object holds a "player_response" key whose value is a JSON string.
    player_response string
    player_response object
  • The player_response object contains "streamingData", which in turn contains the key "adaptiveFormats": an array of objects, each corresponding to a different media format of the YouTube video. Each of these objects contains useful information such as the type and size of the media, dimensions, duration etc. and, of course, the direct URL to the media file (a sketch of this extraction follows the list).
    adaptiveFormats array
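Putting the above together, here's a minimal Python sketch of the extraction chain. Hedged: the regex, its naive brace handling and the key names reflect the 2019 page layout this post describes and may no longer match what YouTube serves:

import json
import re
import urllib.request

def adaptive_formats(video_url):
    # fetch the watch page and pull out the ytplayer.config object
    html = urllib.request.urlopen(video_url).read().decode("utf-8")
    m = re.search(r"ytplayer\.config\s*=\s*({.+?});", html, re.DOTALL)  # naive: stops at the first '};'
    config = json.loads(m.group(1))
    # args.player_response is itself a JSON string
    player_response = json.loads(config["args"]["player_response"])
    # streamingData.adaptiveFormats holds one object per media format
    return player_response["streamingData"]["adaptiveFormats"]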
Once I figured out this structure, I quickly fired up my PyCharm IDE and implemented this extraction logic. All was fine except the download speed. The rate at which youtube-dl downloaded a media file from YouTube was wayyyy faster than the rate my simple urllib.request.urlretrieve() managed. It would take close to three minutes to download a 3 MB audio. I tried my code with other direct file-download links; curiously, the problem didn't exist outside of YouTube - I was getting close to my connection speed during those downloads. I was convinced it must be Google throttling my download speed, and a couple of SO posts reinforced this suspicion. Then, I tried adding different headers to my download request, having switched to the requests module for ease of handling headers and such. The logic was that Google was probably flagging the download request as a bot/scraper's doing, seeing that there were none of the additional headers that would have been present were it a human trying to watch a video. So, I added a couple of 'normal' headers:
{'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0',
'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate',
'Accept-Language': 'en-us,en;q=0.5'}
Needless to say, that didn't work. And I knew it was just a hit or miss anyway.
I researched more about how popular download managers such as Internet Download Manager perform these large downloads 'properly'.
I looked to see if there was any information in the response from the YouTube server regarding this and looked up a few of the response headers. Then, somehow, I stumbled upon this and thought perhaps I should try the 'Range' HTTP header to request a fixed range of the file's bytes in each request until the entire file had been downloaded. MDN was my reference. When I was done translating this to code, I hit run - and voila! It worked! The file downloaded almost instantly. So, it wasn't the lack of a user-agent header or some other anti-bot feature that was the problem, but my failure to realize that the server wanted to serve media-file queries with an HTTP 206 Partial Content response - which, without me specifying a 'Range' in the request header, wasn't going to happen.
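For reference, a minimal sketch of the fix, assuming total_size is known up front (e.g. from the format's 'clen' field):

import requests

CHUNK = 1024 * 1024  # request 1 MiB per range

def download_ranged(url, filename, total_size):
    # ask for the file a fixed byte range at a time; the server then answers
    # each request with HTTP 206 Partial Content
    with open(filename, "wb") as f:
        start = 0
        while start < total_size:
            end = min(start + CHUNK - 1, total_size - 1)
            resp = requests.get(url, headers={"Range": "bytes=%d-%d" % (start, end)})
            resp.raise_for_status()
            f.write(resp.content)
            start = end + 1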
Anyway, it was a good exercise and a good problem to solve. Now to Android Studio for the actual reason this started.
Code is available here

Edit:
ytplayer.config.args.player_response.streamingData doesn't always seem to have the key 'adaptiveFormats'. This path worked fine days ago, but all of a sudden today, it just up and quit being there. I don't know what caused it, but referring back to the original youtube-dl source on GitHub, and after some fiddling with Google Chrome Developer Tools, I discovered that ytplayer.config.args.adaptive_fmts holds essentially the same data as a URL-encoded string. The individual adaptive-format media infos are delimited by commas ','. This fact can be used to tokenize the string. Each token can further be split by ampersands '&', which delineate the different parameters pertaining to the individual media. The media size is held in the key 'clen' and the media type in 'type'. The direct link to the media still rests in the key 'url'.
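A sketch of that tokenizing, assuming adaptive_fmts already holds the raw string (parse_qs takes care of the URL decoding):

from urllib.parse import parse_qs

def parse_adaptive_fmts(adaptive_fmts):
    media = []
    for token in adaptive_fmts.split(","):  # one token per media format
        params = {k: v[0] for k, v in parse_qs(token).items()}
        media.append({
            "size": params.get("clen"),  # media size in bytes
            "type": params.get("type"),  # MIME type
            "url": params.get("url"),    # direct link to the media
        })
    return media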

Edit:
Another twist. Using data from ytplayer.config.args.adaptive_fmts does always ensure working URLs for the various media formats of a video. However, as I learned the hard way, depending on whatever factors YouTube utilizes, sometimes videos - even ones that could be downloaded seamlessly before - cannot be downloaded. Attempting to download media using the URLs from adaptive_fmts gives an HTTP 403 'Access is denied' error. It turns out - and of course this is from studying youtube-dl's source - that for some videos, YouTube uses a kind of encryption. Say the following is a token - one media file's information - obtained by splitting the ytplayer.config.args.adaptive_fmts string from this video's webpage on a comma ',':
sp=signature&xtags=&quality_label=1080p&projection_type=1&fps=24&bitrate=4384085&s=22842844852A956D7573EA30727E30F9DE2013D783BFF5.61C376EEF3FEE2A54A57DBC972DAC4E972C270A7777&lmt=1540713480713402&type=video%2Fmp4%3B+codecs%3D%22avc1.640028%22&size=1920x1080&init=0-714&clen=92248685&index=715-1250&itag=137&url=https%3A%2F%2Fr2---sn-fapo3ox25a-3uhs.googlevideo.com%2Fvideoplayback%3Fid%3Do-AIgcdy13GCdDPGcw75ItSjM5wanZJWPZAxICUV_Ux5hm%26aitags%3D133%252C134%252C135%252C136%252C137%252C160%252C242%252C243%252C244%252C247%252C248%252C278%26itag%3D137%26source%3Dyoutube%26requiressl%3Dyes%26mm%3D31%252C26%26mn%3Dsn-fapo3ox25a-3uhs%252Csn-npoe7ner%26ms%3Dau%252Conr%26mv%3Dm%26pcm2cms%3Dyes%26pl%3D24%26ei%3DbvK6XMbALMiI1AbXkbWgAw%26initcwndbps%3D360000%26mime%3Dvideo%252Fmp4%26gir%3Dyes%26clen%3D92248685%26dur%3D219.719%26lmt%3D1540713480713402%26mt%3D1555755499%26fvip%3D2%26keepalive%3Dyes%26c%3DWEB%26txp%3D5432432%26ip%3D77.94.26.114%26ipbits%3D0%26expire%3D1555777230%26sparams%3Dip%252Cipbits%252Cexpire%252Cid%252Caitags%252Csource%252Crequiressl%252Cmm%252Cmn%252Cms%252Cmv%252Cpcm2cms%252Cpl%252Cei%252Cinitcwndbps%252Cmime%252Cgir%252Cclen%252Cdur%252Clmt%26key%3Dyt8
If we directly use the media URL here (the part after 'url='), after URL decoding of course, we're going to be greeted with an HTTP 403 Access is denied error. Strangely, doing exactly that worked just yesterday. Anyway, however and whenever YouTube decides to apply encryption to a video notwithstanding, we can surmise that encryption exists by checking for the presence of the parameter 's' in this token - or by naively downloading with the URL and checking for a 403 status code in the response. youtube-dl checks for the presence of the 's' param; that's what I do as well. In the above token, we can clearly see the following string:
s=22842844852A956D7573EA30727E30F9DE2013D783BFF5.61C376EEF3FEE2A54A57DBC972DAC4E972C270A7777
The part after 's=' is the encrypted signature. We need to decrypt it - in fact, YouTube just scrambles the characters around - and tack it onto the end of the URL as the parameter 'signature'. To find the decryption function, we need to look at ytplayer.config.assets. If this path contains the key 'js', its value contains the path to a JavaScript file responsible for the signature decryption among a ton of other things - probably setting things up for the HTML5 player? If there is no 'js' key, the video probably uses the SWF player (from what I can gather from youtube-dl's source). For this video, at the time of writing, ytplayer.config.assets is the following:
{css: "/yts/cssbin/player-vflJHbzHK/www-player-webp.css", 
js: "/yts/jsbin/player_ias-vfloNowYZ/en_US/base.js"}
I'm not a hundred percent sure, but judging by the way YouTube likes to randomize things, this JSON object could change at any time, even for this video. Anyway, going to youtube.com/yts/jsbin/player_ias-vfloNowYZ/en_US/base.js and downloading the file (~1.2MB), one thing is immediately clear: it is a mess. youtube-dl uses some regex filters to search for and extract the decryption function from this JavaScript file. youtube-dl also has a light JavaScript interpreter built expressly for executing the JS code frequently employed in decryption functions. Though the decryption functions vary between videos, the range of JS functionality used in them is limited: split(), slice(), splice(), reverse(), join(), operators, function calls and so on. Using the JS interpreter to run the extracted decryption code on the 's' value from above gives the unscrambled signature. In this example, the decrypted signature turns out to be
42852A956D7573EA30727E30F9DE2013D783BFF5.61C376EEF3FEE2A54A57DBC972DAC4E972C270A7
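Just to make the shape of these functions concrete, here's a purely illustrative Python equivalent - the actual sequence of operations differs per player JS and has to be extracted from base.js, but it's always built from pieces like these:

def unscramble(s):
    a = list(s)
    a.reverse()                # reverse the whole signature
    a = a[2:]                  # slice off the first two characters
    a[0], a[20] = a[20], a[0]  # swap two positions
    return "".join(a)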
So, all we need to do to get a valid URL is tack on '&signature=42852A956D7573EA30727E30F9DE2013D783BFF5.61C376EEF3FEE2A54A57DBC972DAC4E972C270A7' to the end of the URL in the token presented earlier:
https%3A%2F%2Fr2---sn-fapo3ox25a-3uhs.googlevideo.com%2Fvideoplayback%3Fid%3Do-AIgcdy13GCdDPGcw75ItSjM5wanZJWPZAxICUV_Ux5hm%26aitags%3D133%252C134%252C135%252C136%252C137%252C160%252C242%252C243%252C244%252C247%252C248%252C278%26itag%3D137%26source%3Dyoutube%26requiressl%3Dyes%26mm%3D31%252C26%26mn%3Dsn-fapo3ox25a-3uhs%252Csn-npoe7ner%26ms%3Dau%252Conr%26mv%3Dm%26pcm2cms%3Dyes%26pl%3D24%26ei%3DbvK6XMbALMiI1AbXkbWgAw%26initcwndbps%3D360000%26mime%3Dvideo%252Fmp4%26gir%3Dyes%26clen%3D92248685%26dur%3D219.719%26lmt%3D1540713480713402%26mt%3D1555755499%26fvip%3D2%26keepalive%3Dyes%26c%3DWEB%26txp%3D5432432%26ip%3D77.94.26.114%26ipbits%3D0%26expire%3D1555777230%26sparams%3Dip%252Cipbits%252Cexpire%252Cid%252Caitags%252Csource%252Crequiressl%252Cmm%252Cmn%252Cms%252Cmv%252Cpcm2cms%252Cpl%252Cei%252Cinitcwndbps%252Cmime%252Cgir%252Cclen%252Cdur%252Clmt%26key%3Dyt8&signature=42852A956D7573EA30727E30F9DE2013D783BFF5.61C376EEF3FEE2A54A57DBC972DAC4E972C270A7
URL-decoding (done twice here for readability) the above gives:
https://r2---sn-fapo3ox25a-3uhs.googlevideo.com/videoplayback?id=o-AIgcdy13GCdDPGcw75ItSjM5wanZJWPZAxICUV_Ux5hm&aitags=133,134,135,136,137,160,242,243,244,247,248,278&itag=137&source=youtube&requiressl=yes&mm=31,26&mn=sn-fapo3ox25a-3uhs,sn-npoe7ner&ms=au,onr&mv=m&pcm2cms=yes&pl=24&ei=bvK6XMbALMiI1AbXkbWgAw&initcwndbps=360000&mime=video/mp4&gir=yes&clen=92248685&dur=219.719&lmt=1540713480713402&mt=1555755499&fvip=2&keepalive=yes&c=WEB&txp=5432432&ip=77.94.26.114&ipbits=0&expire=1555777230&sparams=ip,ipbits,expire,id,aitags,source,requiressl,mm,mn,ms,mv,pcm2cms,pl,ei,initcwndbps,mime,gir,clen,dur,lmt&key=yt8&signature=42852A956D7573EA30727E30F9DE2013D783BFF5.61C376EEF3FEE2A54A57DBC972DAC4E972C270A7
This should now be downloadable. Of course, the above link doesn't work because I've changed the 'ip' parameter. But done properly for any video, these steps should produce the right URLs for the required media files.

Btw, youtube-dl seems to use ytplayer.config.args.adaptive_fmts directly for getting hold of the media URLs - coupled with encrypted-signature checks and decryptions when needed, of course. It doesn't first check whether downloadable links are available in ytplayer.config.args.player_response.streamingData.adaptiveFormats. I might have to switch to using ytplayer.config.args.adaptive_fmts right off the bat myself, since it always seems to have the links to all available media anyway.
Also, interesting lines in youtube-dl's youtube.py:
youtube.py lines 1794, 1831, 1891, 1130 (loads signature from cache), 1144 (loads decryption function from player code)

Tuesday, March 26, 2019

Nepali EPapers

A simple python 3.7 script to download common Nepali newspapers as jpegs.

Monday, March 18, 2019

Traveling salesman problem with Genetic Algorithm

This is a try at an approximate solution to the famous Traveling Salesman problem using the Genetic Algorithm heuristic. There are countless articles on the topic out there. I just wanted to make my own 'general' GA module that could be used for any suitable problem. So, here is the implementation. As in the last post, the videos by The Coding Train on the subject were really helpful.

Some observations I made while playing around with the model:
- The size of the population was directly related to the time it took to converge to a solution, but only up to a point. Beyond that, the computation for the larger population probably swamps any gains to be made from its greater diversity.
- Mutations are pivotal. Setting the mutation rate to 0 makes the algorithm settle early on a local optimum, which often isn't a good enough solution. (A typical mutation operator is sketched after this list.)
- There's probably a lot that could be improved. Maybe a large initial mutation rate that decays over time could overcome local optima in favor of global ones. Maybe an improved fitness function, improved cross-over, or a stronger mutation.
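For illustration, a typical swap mutation in Python (hedged: not necessarily the operator my module uses):

import random

def swap_mutate(tour, rate):
    # with probability `rate` per position, swap that city with a random one
    tour = tour[:]
    for i in range(len(tour)):
        if random.random() < rate:
            j = random.randrange(len(tour))
            tour[i], tour[j] = tour[j], tour[i]
    return tour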

Anyway, here is a demo gif:

Sunday, March 17, 2019

Weighted random picker

I've been dabbling in Genetic Algorithms as of late. Great videos here on the subject.
In implementing a GA using JavaScript, I came across the problem of selecting individuals from a population into a mating pool depending on their fitness values. In terms of code, it basically boils down to picking from a list of objects according to a custom distribution, defined typically by 'weights' assigned to each of them.
I first tried the most obvious and most naive solution: generating a new list consisting of the original list's objects, with each object's count corresponding to its weight. For example, if A, B and C are the objects with a weight distribution of 50%, 30% and 20%, I created a new list of, say, a hundred total objects with 50 A's, 30 B's and 20 C's. Choosing randomly from this new list should then give roughly the desired distribution over the original objects. But this method quickly led to a whole host of problems: for starters, for large original populations, the individual probabilities/normalized weights could be arbitrarily small, which meant the new pool's total population needed to be correspondingly large to reflect all the small differences between the original members' weights. Ignoring this fact led to diminishing diversity in subsequent generations and, owing to that, a complete collapse of the mating pool after a number of iterations.
I found a couple of solutions to the problem while researching the topic on the internet, but most of them, while fully functional, were unintuitive. There was a lot of head-scratching and frustration-filled deliberation. I knew that for a weighted/biased true-or-false decision, the fact that a random percentage needs to be less than a given occurrence rate can be exploited as a decision criterion. For example, if the desired occurrence rate is 1%, choose a random percentage, and if it happens to be less than 1%, take the outcome as true. The distribution is obviously going to be heavily skewed towards false, and it checks out against the desired rate of 1%. However, when it comes to a bunch of different objects, each with its own probability/relative weight, the logic cannot simply forgo the interdependence that inherently exists between the objects. Using the same idea willy-nilly on each member just doesn't work.
Then, I got an idea of my own. Somehow, an object's probability score had to map to its selection, and that process could not be deterministic. Just because a member has a score of 80% shouldn't always mean a yes, and shouldn't always mean a no. It should be randomly determined, but even in randomness, the choice should be a yes more often than not. What if I multiplied each member's probability by a random percentage? Some members would get a large random percentage and some a small one. Since the random numbers themselves don't have any particular bias, the resulting 'scaled' probabilities will be random, with a skew defined by the member-specific weights. Because of the imposed biases in the form of weights, members with larger weights will, on average, score higher than the others. This won't always be true, but the frequency with which a member ends up with the highest final score grows with its assigned weight - the greater the weight, the more easily it becomes the top scorer. Members with lower weights will occasionally be the top scorers - but rarely. When we pick the highest-scoring member enough times, the resulting distribution reflects this phenomenon.
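In code, the whole scheme is tiny. A Python sketch of the idea (my actual implementation, linked below, is JavaScript):

import random

def weighted_pick(items, weights):
    # scale each weight by an independent uniform random number
    # and pick the item with the highest resulting score
    scores = [w * random.random() for w in weights]
    return items[scores.index(max(scores))]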
Phew! Enough rambling.
Here is the code.
For a piece of code that small, this is such a big explanation, isn't it?

Tuesday, March 12, 2019

Dijkstra's algorithm for most optimal path

A friend of mine who studies Operations Research in Transportation Engineering suggested that I look into the problem of path optimization subject to one or more constraints and provided me with a paper he wrote some time ago. I perused the documents and was fascinated by the topic. Following some internet research, I thought that I could start by trying to implement a shortest path algorithm known as Dijkstra's algorithm.
This is a Python implementation of Dijkstra's optimal-path algorithm using an adjacency matrix from a CSV file. The direction of travel is row to column. An inaccessible path is represented by a distance of 1000. Given a starting node, the program looks at its neighbor nodes and, for each, calculates and stores the distance from the starting node along with the parent's index. Similarly, for each of the neighbor nodes, their neighboring nodes are sought in turn, and the shortest cumulative distance to each from all possible routes, along with the corresponding parent node index, is stored. This continues until all the nodes have been visited. In the end, each node has two important pieces of information attached: the parent node corresponding to the optimal path and the optimal total distance to the node from the origin/starting node.
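A compact sketch of this in Python - using a priority queue for the 'visit the closest unvisited node next' step, which the plain matrix version doesn't strictly need; matrix is the adjacency matrix read from the CSV:

import heapq

INF = 1000  # marks an inaccessible path, as in the CSV

def dijkstra(matrix, start):
    n = len(matrix)
    dist = [float("inf")] * n
    parent = [None] * n
    dist[start] = 0
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in enumerate(matrix[u]):  # travel is row -> column
            if w < INF and d + w < dist[v]:
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, parent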
This way of going about the shortest route problem is based on this explanation which I found very helpful.
Optimal path from node 2 to node 15
Manual Dijkstra iteration along with content from the aforementioned paper on the topic

These Google Maps pictures with the assignment of nodes from 1 to 15 have been directly excerpted from my friend's paper.
Code @ GitHub

Friday, February 22, 2019

Convex Lens Simulation using Processing

Uses an ellipse as a convex lens to refract rays of light from a point source.

Simulating something that IRL seems as intuitive and mundane as refraction in a lens was inordinately difficult. Reaching this point required multiple redos of the 'project' and three full days of wondering, scratching my head, sketching, scrapping, writing, deleting and rewriting - over and over. The main hurdles I faced were:
- Detecting accurately that a ray has hit the air-lens interface
- Figuring out whether the refracted ray should rise or drop (the vector form of Snell's law, sketched after this list, settles this)
- The fact that vertically down is the direction of positive Y axis for computers. This probably constituted the biggest source of headache and confusion though I'm not sure if that was actually warranted.
- The fact that the Processing IDE is wanting in the most basic of features one expects of a typical IDE.
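For the record, the refraction step itself boils down to the vector form of Snell's law. A generic Python sketch (not the Processing code itself): d is the unit incident direction, n the unit normal pointing toward the incoming ray, and eta the ratio n1/n2 of refractive indices:

import math

def refract(d, n, eta):
    cos_i = -(d[0] * n[0] + d[1] * n[1])       # cosine of the incidence angle
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None                            # total internal reflection
    cos_t = math.sqrt(1.0 - sin2_t)
    k = eta * cos_i - cos_t
    return (eta * d[0] + k * n[0], eta * d[1] + k * n[1])

Whether the refracted ray 'rises or drops' then falls out of the signs of the components - remembering that +Y points down on screen.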

I'm NOT using Processing again for this kind of thing. Whew!
Anyway, I consider this one done:
Code

Sunday, January 20, 2019

Easy Nepali Calendar

I wanted to make something using Electron, so I thought a Nepali calendar program for Windows would be a good idea, seeing how there aren't many of those out there, especially ones with a 'modern feel'. The first version pulled the Nepali date from the web, but in the second I went on to use this module for the date logic instead. I'm releasing only the second version here.
Here is how it looks on an empty desktop:
Download here