Introduction
I am Harshil Jani, an engineering student at NIT Surat in India. I am majoring in Electronics and Communication with a minor in Computer Science. I am passionate about Open Source and Linux. I am not an expert in technology, but I love spending my time with tech.
Contributing to open source software like LLVM and Gitpod, and fixing small things in other random software and web apps, has been part of my open source journey. I have learnt a lot about web technologies, compilers, simulation software, the Linux distros available in the market, embedded systems, communication principles and much more.
This book is about my Google Summer of Code (GSoC) journey. My way of writing things is not exactly straightforward. I will take you through many domains and make you believe that managing time to learn so much out there in the world of computing is not really as tough as a newbie would think it to be.
Let me just tell you that I am learning Rust, and mdBook is a great piece of Rust software that lets you write books with nothing but markdown files. So while I am writing this book about my GSoC journey, I am learning Markdown syntax in more depth whenever I need to, and if I ever want more customization for my book, I will update the Rust source code of mdBook. Side by side, I will describe my work in perf, HTML, CSS, JavaScript, D3.js and the other technologies involved. You see, this is how you use your technical knowledge to create something unique. There are tons of Medium blogs about GSoC journeys, but I want to portray mine in a book, because I am impressed by the book "Crafting Interpreters" by Robert Nystrom. I too want to make sure that some day, as I grow older, some underground kid with a normie lifestyle will read my book and have their own unique creation come out of it.
I am a contributor at CERN-HSF, and I will describe my whole journey with you to let you know how I made it to GSoC, what I am going through right now, and what I will be doing next in that regard. We will go through every phase of my exploration.
I want to bring out the Christopher Columbus in my readers, who will eventually build up the habit of googling the stuff which I am too lazy to describe and will instead be handing over to you as a task.
Don't expect me to explain the features of this book. You can click around by yourself and learn what goes on when you click here and there, like changing the theme to suit your beautiful eyes or jumping between chapters.
Exploring Open Source
It was around September when I, being a kid, was exploring web technologies and some higher-level languages while playing around with basic programming questions. I had seen many posts about GSoC, and by this time I was pretty much aware of what GSoC is and how it works. I had read a few articles and watched some videos about it, so I came to know that it has something to do with open source. Luckily, I was already a Linux learner back then, so the curtains of open source had been unveiled to me. I had been through the history of great computer scientists and developers who brought revolutions, like Linus Torvalds, Richard Stallman, Ken Thompson and others.
So I just needed a kick-start for contributions. By this time, I had already used Git and GitHub for my own personal projects and in some small, not-so-notable hackathons and events. So I knew the terms used in it, like fork, clone and so on, better than my peers, because I had founded a failed community named Nactore with almost two regular members. At Nactore, I wrote articles on Git and GitHub which hardly any of my fellow peers have read. It was tough for me to find an audience, as I was not good at marketing my stuff. A bit of shyness and my own set of principles disallowed me from continuing the Nactore Forum (a shitty site hosted on Wix), and it just boom vanished.
I saw a FreeCodeCamp guide to open source on video and learned how to use GitHub more efficiently and find good issues for yourself, and then I started doing so. As I said earlier in my introduction, I was learning C++ in more detail, so I started contributing fixes to the DSA repository of the Tenet Coding organization. Soon we came to October, which is popularly known for Hacktoberfest in the open source world. I asked the ACM club in my college to organize something for it. At the same time, I had frequent contributions in Tenet, so I was given an opportunity to mentor Hacktoberfest 2021 there. I merged PRs, rejected PRs, and also contributed to other projects myself. When Hacktoberfest was over, I got a T-shirt for getting more than four of my PRs merged.
To those who are entirely new: google the terms "Pull Request" and "Issue".
Then I participated in other events like GirlScript Winter of Contributing (GWOC) and Winter of Code 2.0 (by GDSC IIIT Kalyani). GWOC is all about documentation; I wrote some technical articles and got my PRs merged. This was the time I was working on some large projects (by quantity) where you might face merge conflicts, issues being closed and reopened, commit squashes and so on. I learned a lot of things and open source standards from this event. By this time, I had started exploring the GSoC organizations and previous projects. "It all comes by exploration."
If you have never visited the GSoC site and explored the archives, go do that.
In WoC 2.0 by Kalyani I saw LLVM there, which is about compilers. The other orgs were not catching my interest, as I was fed up with web development everywhere and wanted to try something more beautiful. I wrote a proposal for it and sent it to the mentor there. It got accepted, and then I told my mentor: I know nothing about compilers, what will I do? He suggested doing some online study. I enrolled in a Stanford course on Coursera and learned about compilers. Then I was told to build LLVM, but I had a boring laptop with 4 GB of RAM, which was not sufficient. So I skipped the program after informing my mentor and kept things on hold. Meanwhile, I continued with my course and learnt more around it. Fast forward to late January 2022: I got myself a new laptop with 16 GB, and I built the software on my system by myself. Let's not get into more detail about my WoC 2.0 journey, which was really a breakthrough in my open source exploration. All thanks to Nimish Mishra, who helped me with the contribution process. If you want to know my LLVM journey in detail, read this article of mine: LLVM Journey
I turned out to be a top contributor in both GWOC (Open Source domain) and GDSC's WoC 2.0. By now, I was comfortable with open source projects. It was the month of February, the peak time for all GSoC-aspiring candidates. Let's read about GSoC in the next section.
Exploring GSOC
Since I wanted to be part of GSoC, I started exploring more and more organizations and tried to contribute consistently. I had an eye on LLVM, CERN-HSF, Fossology, GFOSS, SPDX, About Code and many others. But one thing that caught my eye was CERN's workflow: there was an internal round which you needed to clear with the mentor in order to discuss the project further in detail. All I did was go through the requirements and try to learn more about the projects. I liked three projects at CERN, which were:
- Geant4 - Data Visualization
- ROOT - Cling Interpreter Support for adding libraries
- CERN-Box - Migrate Smashbox to Python3
I frequently visited the site and read the requirements of the projects. Meanwhile, I was wondering what data visualization actually is, and I saw a TEDx video about it. It was a very good explanation and presentation of data visualization, and the best thing I learned from it was: "It allows you to raise questions that you might not raise with just the raw data, without the visuals." This got me more and more interested in the Geant4 project; the other part of it was perf, the profiling tool for Linux. I was a little aware of Valgrind, which is a similar kind of tool. So what I did was simply learn more about these two technologies and write two articles on Medium. You can read them to know in detail how I used them to gain more insight and knowledge about the tech involved in the project: Data Visualization using D3.JS and Perf
In parallel, I was also attracted to a simulation software named Apothesis, which simulates Kinetic Monte Carlo processes. The project was about creating a parser and generalizing the input and output for the simulation software. This was under the Greek GFOSS organization. I was in deep conversation discussing both projects in parallel. In CERN's first round, I was given a task to create a scatter plot from a raw sample file along with some bonus changes. I made the chart and submitted the task in one day, and I got a green signal from my mentor Guilherme Amadio to start working on the proposal and discuss the rest on the proposal itself. I had plenty of questions and he helped with all of them. Finally, I had a proposal ready for submission. Meanwhile, in GFOSS as well, I had clearer proposal refinements. You can always mail me @harshiljani2002 in case you need any of my proposals.
I had my end-semester exams going on while the proposals were being reviewed, so I couldn't really find time for more contributions. Fast forward: the results were announced and boom
🎉 My proposal was accepted at @CERN-HSF for Geant4 - Performance Analysis and Data Visualization.
Community Bonding Period
Before we actually get started with the project, there is a phase of time known as the Community Bonding Period. Each organization has its own set of rules, workflows and hierarchy to follow, and this period is crucial for getting new contributors familiar with the organization. In this time, students get familiar with the work to be done in their community, get to know their mentors, plan their project in more detail, receive guidelines for contribution, get resources to gain knowledge on various topics, and so on.
Talking about my personal experience: on 20th May I got to know about the selection. Then I got a mail from my mentor about an introductory call, which we scheduled. In that call, both of my mentors, Guilherme Amadio and Bernhard Manfred Gruber, and I had a brief introduction. We then had a more project-oriented discussion about which parts I was supposed to look at more often. I got links to guest accounts at CERN which were to be used for further communication and activities. After the first call, I was playing around with the repository and had a few errors and questions, so the next week we had another call where they helped me get those things fixed. Then I was advised to start working on the project to get going. My mentors taught me how the reports were generated and how to generate CSV reports, and I got to know more about perf. They shared a few presentations and documents of theirs which helped me in setting up perf. After this, I generated the reports myself and converted them into an HTML table. At first I used a Python script with Pandas, but that was not a good idea because it doesn't allow much customization. So I figured it out by myself and built those tables using D3.js logic itself. Once the tables were on the page, I worked on filtering them.
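To give you an idea of what "D3.js logic itself" means here, below is a minimal sketch of building an HTML table straight from a CSV with D3 alone (no Pandas). The file name "report.csv" and the "#table-container" id are placeholders I made up for illustration, not the actual names used in the project.

```js
// Load the CSV and render it as a plain HTML table with D3 (sketch only)
d3.csv("report.csv").then((data) => {
  const table = d3.select("#table-container").append("table");

  // Header row built from the CSV column names
  table.append("thead").append("tr")
    .selectAll("th")
    .data(data.columns)
    .join("th")
    .text((column) => column);

  // One row per CSV record, one cell per column
  table.append("tbody")
    .selectAll("tr")
    .data(data)
    .join("tr")
    .selectAll("td")
    .data((row) => data.columns.map((column) => row[column]))
    .join("td")
    .text((value) => value);
});
```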
It also became clear to me that we were working on CMS, which is a general-purpose detector at the Large Hadron Collider (LHC). I visited the CERN website and took a virtual tour of CMS. Everything seemed like rocket science to me, but I was blissful to have even a virtual tour of it. I wonder how great the people working there must be. Meanwhile, there was a GSoC Global Summit on 3rd June, so I attended the whole summit and noted down a few learnings I took from it. Here is an article about them: GSOC Summit Learnings
Meanwhile, I explored more and more about the project, and now I was able to explain to people that CERN has the CMS detector, which is simulated by Geant4, and that I am profiling that software and providing the visuals for it. I also explored the Phoenix software of CERN-HSF, which visualizes CMS, ATLAS and other huge models out there at CERN. Currently the GDML file is not available for ATLAS, so we could not add the visuals for ATLAS yet.
Google " GDML " for learning about it and also for virtual tour to CMS just google CMS CERN
At the end of the period, I already had a rough plan of the progress to be made, and things were under control. Now it was time to get into the coding journey, since the community bonding period was about to end. I also found that my mentors are quite amazing personalities: one is a maintainer at Gentoo Linux and the other is very good at free running and parkour. They have a really good work-life balance, I guess.
Week 1
My mentors and I settled on weekly progress calls to discuss the progress of the project and what we should look into next. The week starts with this call itself. Now that I already had a rough plan set, it was easier to pick clearer ideas based on it.
Let me show you the minimal version of the site which I already had at the start of Week 1, by the end of community bonding. This is still better than what I had for my very first attempt: I misunderstood the task, took the perf text reports and stuck them onto the site. But then my mentors corrected me, saying they don't actually need the text or CSV report on the page; it was all about converting the generated CSV report into HTML tables.
Here, the table has been loaded from a demo.csv file which did not hold much data; it was just a sample file with a very small number of entries. If you look closely, there is a threshold-based filtering part which I had implemented with a for loop, looping through all the entries and returning the ones that fall within that particular range. It was working well.
Let's get back to the discussion in the weekly call. I was now expected to add colours to the tables based on the metrics, and we also had further discussions about each chart of the project individually. I coloured the whole row based on a single metric, but then I was corrected: I was expected to colour not the whole row but each cell, based on its own metric measure. So I changed that as well and made each cell coloured.
One part of the project was to convert the perf reports to CSV files using an awk script, which was done by my mentor Guilherme Amadio. There was now more than one CSV, and we should allow the user to visualize all the reports. So the next thing I did was generalize the loading of CSV files into the site. Currently, all you have to do to add a CSV file is drop it inside the Data/ directory and add the name of the file to an array; that will load all of the files. Later I am thinking of generalizing even the array and letting the site read files directly from the directory itself. As I finished this, I was showing the site to a close friend of mine. She found a bug in it: when I generalized the reports, I completely forgot about the "Download CSV" button. Its file path was broken, so when she clicked on it, it gave her a 404. She reported this to me and it cost me two lollipop candies 😑😑. I fixed it up quickly, so it won't cost me more candies and she won't get cavities.
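For the curious, here is a rough sketch of that "add the file name to an array" idea: every CSV listed in the array is fetched from the Data/ directory and handed to the same rendering code. The file names below are illustrative, not the actual reports.

```js
// Illustrative file names; the real reports use their own names
const reportFiles = ["cpu-metrics.csv", "cache-metrics.csv"];

// Fetch every listed CSV from Data/ and process them once all have loaded
Promise.all(
  reportFiles.map((name) => d3.csv(`Data/${name}`))
).then((reports) => {
  reports.forEach((rows, i) => {
    // In the real site each report would be handed to the table-rendering code;
    // here we just log what was loaded.
    console.log(reportFiles[i], rows.length, "rows");
  });
});
```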
Now there was a real problem with the threshold. Since the last time I ran perf and generated the CSVs, I had actual data which was quite large compared to my demo data, so the for loop for the threshold was taking much longer. I had thought it was the best possible way to find the entries within the threshold, because it was O(N). But when I discussed this issue with my mentor, he suggested applying the filters using the D3 logic itself, so the rows get filtered and coloured while D3 renders them into the page. He shared a sample snippet of how it could be implemented. It helped me draw out the idea, and I was able to implement the logic. But it also meant huge changes to the code: the whole logic was shifted to D3, and every time new thresholds are set, we need to reload the whole CSV into the table report. It was definitely helpful and worked really well compared to my loop-based logic.
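Here is a hedged sketch of what shifting the threshold into the D3 logic can look like: filter the bound data and let the data join rebuild only the rows inside the selected range. The function, selectors and parameter names below are placeholders for illustration, not the project's actual code.

```js
// Rebuild a table body showing only rows whose value in `column`
// falls inside [min, max] (sketch only)
function applyThreshold(tbody, data, column, min, max) {
  const withinRange = data.filter(
    (d) => +d[column] >= min && +d[column] <= max
  );

  tbody.selectAll("tr")
    .data(withinRange)
    .join("tr")
    .selectAll("td")
    .data((row) => Object.values(row))
    .join("td")
    .text((value) => value);
}
```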
The design of the site was not so great, because I had used view-height and view-width everywhere to adjust things. So I was advised to use Grid or Flexbox and make it more responsive, and I was shared some tutorials on them. I was more comfortable with Flexbox, so I applied flex to the whole site. I think it now looks much better than it did before.
Here is the screenshot of the site at the end of Week 1.
TL;DR:
- Coloured the individual cells based on their metric.
- Converted perf reports to CSV with the help of awk scripts (provided by my mentor).
- Converted the threshold logic from a for loop to D3.js logic.
- Generalized the file input for loading CSVs.
- Changed the styling of the site using Flexbox.
Week 2
This week was really a week of work and I loved being involved with the tasks. Just to mention, apart from my GSoC, I was selected as a contributor for C4GT, the "Code for Government Technology" program held by Samagra Shiksha, a part of the Indian Government. Now I was part of two beautiful open source projects from two beautiful organizations. I was expected to learn about React-Admin, JSON-LD and many more popular and underground technologies. It was really a booster to my learning curve, working on both simultaneously. C4GT is a whole different game; if I get time you can expect a separate book for it from me.
We had a meet at the start of the week where we discussed the progress made last week and then listed out the next week's tasks. Let's talk about them all in detail.
As you might have already seen in the image from last week, the first task was to get the numeric columns before the comm, dso and symbol columns. So I made changes in the script which converts the perf reports to CSV, to print the numeric columns first in order. This was an easy fix.
Now, the current measures make no sense to readers; raw data is not useful in any way, so we need to convert the data into relative measures. For example, when measuring CPU metrics, if the data file contains cycles and instructions, then we also need to add a measure of CPI (cycles per instruction). The raw data must also be formatted so that the cycles and instructions become percentages of their respective column totals. Similarly, for cache metrics, if we have branches and branch misses, then we must add a branch miss rate. There was also an issue that some metrics were not provided by the Intel processors, so we couldn't extract that data (L1 instruction loads). So I spent some time fixing the code to include new metrics and improve the existing ones. In the end, it looks really nice. Don't worry, at the end of the page I will stick in the snapshot.
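As a small illustration of the derived measures mentioned above, here is a sketch assuming the rows carry raw cycles, instructions, branches and branch-misses columns; the real column names and data in the reports may differ, so the sample values below are made up.

```js
// Tiny made-up sample standing in for the rows loaded from the CSV report
const data = [
  { cycles: "1200", instructions: "800", branches: "300", "branch-misses": "15" },
  { cycles: "600",  instructions: "900", branches: "200", "branch-misses": "4"  },
];

const totalCycles = d3.sum(data, (d) => +d.cycles);
const totalInstructions = d3.sum(data, (d) => +d.instructions);

data.forEach((d) => {
  // Cycles per instruction for this row
  d.cpi = +d.cycles / +d.instructions;
  // Raw counts expressed as percentages of the column totals
  d.cyclesPct = (100 * +d.cycles) / totalCycles;
  d.instructionsPct = (100 * +d.instructions) / totalInstructions;
  // Branch miss rate as a percentage of all branches
  d.branchMissRate = (100 * +d["branch-misses"]) / +d.branches;
});
```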
I was expected to install CVMFS, the CERN VM (Virtual Machine) FS (File System), and mount the Geant4 public repository, which creates a bare clone on my system and lets me run the Jenkins reports, so I can reproduce the perf reports available in the CI for Geant4. This was important because the current flame graph I had was not showing the actual measures. So I had to set this up and run the Jenkins script. Now, let me describe a problem I faced in doing this. I use Ubuntu 22.04 (sorry to the Linux community, it might hurt some sentiments that I use Ubuntu), but CVMFS didn't have the FUSE libraries set up for 22.04 Jammy Jellyfish. I went to the GitHub-hosted repo and found that they are currently working on this. A simple workaround was suggested there: fetch the OpenSSL library yourself and then use the FUSE package for 20.04 as-is, because SSL was not packed by default in Jammy Jellyfish. While exploring this, I found out that the suggested workaround had been bumped from 2.13 to 2.15, and for some reason 2.13 no longer existed on the UMD mirror. So I commented a new workaround in the issue itself and got CVMFS settled on my system. It was working fine and I mounted the Geant4 repo in it. Next was a CCACHE error while running the Jenkins script. I fixed the report script and made a pull request to the original g4run repo. Then I started running the script. It was working fine, but meanwhile, for some spooky reason, my perf broke and kept running continuously, so I was not actually able to finish the Jenkins script. This is something I need to do in the presence of my mentors, in the call itself, so they can help me figure out my blunders.
Another task of the project for the week was to implement treemaps from the data coming out of the perf reports, using a script provided by Guilherme Amadio. I wrote the algorithm for the treemap, and now we had treemaps on the site. But there was one problem: initially, when I tried loading the treemaps, they would not get rendered. The treemap nodes needed the offset positions of their measures relative to the screen size, but I had used a CSS property to hide the contents of the inactive tabs, display: none, so all the coordinates were returning 0. It took me a while to figure out why the treemaps were not loading, but finally I found a way to mimic the display: none behaviour without actually hiding the elements from the layout: I replaced it with visibility: hidden and position: absolute. This let the site render the treemaps along with the other tab contents. It also improved the rendering performance of the site, which I measured using the performance analytics in Chrome DevTools (beta). I also added an inverted flame graph, but that will soon be replaced by real, working flame graphs.
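If the display: none issue sounds abstract, here is a small sketch of the idea: a tab hidden with display: none reports zero width and height, so the treemap nodes all collapse to 0, while visibility: hidden plus position: absolute keeps the tab measurable. The #treemap-tab selector is a placeholder, not the actual id used on the site.

```js
// Hide a tab while keeping it in the layout, so its size stays measurable
function hideTab(selector) {
  d3.select(selector)
    .style("visibility", "hidden")
    .style("position", "absolute");
}

// Bring the tab back into normal flow when it becomes active
function showTab(selector) {
  d3.select(selector)
    .style("visibility", "visible")
    .style("position", "static");
}

// The treemap layout can then be sized from the (still measurable) tab
const { width, height } = document
  .querySelector("#treemap-tab")
  .getBoundingClientRect();
const treemap = d3.treemap().size([width, height]);
```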
As promised in the middle of the text (if you read it fully), here are some glimpses which depict the progress at the end of this week.
Week 3
This was the toughest week for me, not in terms of GSoC but because, with multiple things coming over my head, I was insomniac throughout the week. It was not fun while working. Never mind though; from my experience I can tell that there are always ups and downs. Nothing much to worry about. We will soon recover.
Let's not drag it out and get started. On Monday, as usual, we had a meet where we discussed the progress of the past week and then made a few required changes. I got my very first GSoC PR merged into the repository where I mainly have to work. My mentor Guilherme Amadio helped me get through the changes.
I had successfully mounted Geant4 on CVMFS, as stated last week. This week we tried running the Jenkins script from the Geant4 hosted on CVMFS. Then the team needed the difference-per-function reports, which I generated by hand and added to the reports. The main part was customizing the treemaps based on the metric used for the area, colour or tooltip, and that was implemented as well. It involved many edge cases which caused a pile of if/else statements, but functionality comes first, and the performance of the software doesn't get affected, so it really doesn't matter. In the future, if we have time, we can think about making it more generic.
I also changed the colours all over the reports and treemaps to a green-white-red scheme, which initially was blue and red. It is now visually improved.
In the meeting, I had my screen shared and I was using HTTPS for my git clones and forks, so I accidentally exposed my GitHub token on the shared screen to my mentor. He was like: "You are not supposed to show me this!!" So he advised me to use SSH for my forks and clones, and I converted a few of my repos to SSH clones. While working, I messed up my repository, so I was taught the relevant git commands in a clearer, more hands-on way. I made the required changes and cleaned up the mess.
All of a sudden, perf broke on my laptop for no reason. I cannot figure out how it happened; it was still working fine with sudo, but that might mess things up later in the repo. Anyway, I ran the Jenkins report, but it was way too heavy for my PC to handle: around 44 gigs were occupied, which was weird for me. I got to know that perf is memory-hungry; it is common. A fix for this was generating only the e- (electron) reports and excluding the others.
I was also expected to read an article on Top-Down Microarchitecture Analysis, but as I said, I had a rough week this time. My CodeForGovTech project's mid evaluation was on my head, so I had to speed-run this week, since next week I will be busy with wedding stuff (luckily not mine). I travelled about 125 miles and did my treemap work on the bus itself, and went through a no-sleep week with about 2-3 hours of sleep daily, that too on an irregular basis. I gave a quiz for the Flipkart Grid thing and attended an Amazon school program. It was all coming together at once; I was about to quit a few of these things but luckily managed them, crazily, and got my stuff done in a better way. This was a hectically fun week, but I won't recommend keeping it up for long; it degraded my working performance. Anyway, as per tradition, here is the screenshot of this week's updates and the link to the PR which got merged.
PRs : https://github.com/amadio/g4run/pull/1
ScreenShot :
Week 4
Welcome to another week. I am now getting a clearer picture of the project, CERN, CMS, the experiments at CERN, the work culture at CERN, physics, how the world works and a lot more.
I am really enjoying learning more and more about the LHC. I should not even say learning; it is more like getting to know. I love knowing more about the LHC (Large Hadron Collider) and also about the Higgs boson particle. These things are way beyond my project work, but I am in love with them. They are really making an impact on lives on this planet, and they bring new possibilities and capabilities. I really love things at CERN after this much time working here. Meanwhile, I am also about to participate in Webfest, which used to be a week-long hackathon, but this year it is spread over a month. Luckily, their event theme is Environment and one of the project themes is data visualization. Thanks to my GSoC project, I will have a much better idea about such a project than I would have if I were not part of GSoC under CERN. Let's rock and roll into CERN's Webfest. Now, let me give an update about my week so far.
Starting off from Monday: in the meeting I was not at home, so I was facing fluctuations in my internet connection and was barely able to discuss things with my mentor. Still, we got a few points cleared for the week. Actually, this week was about scatter plots, but due to my involvement in the wedding, we settled on shifting scatter plots to next week, which was already reserved for advanced scatter plots. Then we discussed a few things about the statistics to be derived from the existing metrics and the new ones to be added soon to the web reports. I then had a sudden thought of breaking the difference report out from the raw data, because the diff report is of more importance to the developers, and I got a "YES" for doing so. We ended the meeting with poor-internet goodbyes.
Now, coming to the work. My mentor had already provided me with the extra data to be added before he went to Brazil for his vacation. For the first two days, I was busy with the functions at my cousin's wedding, which simply means I was off work. Indian weddings have a lot of traditions and customs to be followed, and being a brother, you have many different roles to play in the wedding. Then for the next day I was again on the bus, coming 125 miles back, with 50 more miles added for performing the other half of the tradition that happens after the wedding. So, another day off. Meanwhile, on Sunday I had a mid evaluation, so at the last moment my mentor from C4GT popped up saying he required a major change. We toiled for two days and got the stuff done. I am glad that my C4GT project is also adding a lot of value to the Indian Government's infrastructure. So, this whole week, I have done no work.
Seriously, you would be like: "Hey! What the heck is wrong with you? If you have done no work, then why on earth does the page for Week 4 exist?" Cool down, reader! I was just joking. But yeah, joking doesn't mean lying. I did work, only on Sunday afternoon, night and late night. I speed-ran my project on Sunday: I included new reports, updated the colouring to be more generic, popped the difference tab out from the main raw reports, and fixed the sizing of the table cells' height. I closed the week late on Sunday night, or better to say early Monday morning. It was really a speed run, like the ones I last did as part of some hackathon. I love speed runs; they help you get a lot of things done in a very short amount of time. It was not new for me and I am used to it, so it won't affect my performance. I have practiced them a lot.
What is a speed run?
Getting to the main concluding goal as quickly as possible, regardless of how much is unlocked
I also want to share how I made the colouring more generic, through this snippet.
const max_array = [];
numeric_columns.forEach((column) => {
  const value_array = [];
  // Collect only the valid numeric values for this column
  data.forEach((d) => {
    if (!isNaN(d[column]) && isFinite(d[column])) {
      value_array.push(d[column]);
    }
  });
  // d3.extent gives [min, max] for the column
  max_array.push(d3.extent(value_array));
});
The above snippet uses a D3.js function, d3.extent, which returns an array of two elements: the minimum and the maximum of all the values.
The math which I currently use for colouring is: the lower 5% of the values are green, the upper 5% are red, and in between I use white. But I still feel there is something not quite right with my math here. I am sharing the snippet below for reference, and I will fix this as soon as I get clearer insight into the statistics.
SOME_ELEMENT.style("background-color", d => {
  // Only colour cells belonging to numeric columns
  if (numeric_columns.indexOf(d.column) !== -1) {
    const max_col = max_array[numeric_columns.indexOf(d.column)];
    // Piecewise linear colour scale: green -> white -> red
    return d3.scaleLinear()
      .domain([5 * max_col[0] / 100, 75 * max_col[1] / 100, 95 * max_col[1] / 100])
      .range(["green", "white", "red"])(parseFloat(d.value));
  }
});
In the domain array, the three values are [5% of min, 75% of max, 95% of max], where min and max come from d3.extent as elements [0] and [1] respectively. This replaces the situation where I had if/else branches for each specific metric, which was the worst idea for mankind. Currently this doesn't look great, but it will definitely be improved soon.
Also, my GSoC blogs, which are a more official version of this very book, have been merged into the CERN-HSF site. They use codespell for spell checking. I had a few spelling errors which codespell flagged, so I fixed them and got my changes merged. I am now planning to add codespell to my GSoC book for better automation. You see how you can learn little things from an open source program: automation just saved the PR reviewers from worrying about spelling errors. I also had an open issue on my GSoC book repo for spell checks, which I now feel was a dumb idea. I will integrate codespell into my GSoC book too.
In the end, we won't have meetings this week. So, until next week, we have already planned to bring out the scatter plot from the existing CSVs which we have with us. And below are the snapshots, as per the tradition of our beloved readers who look at the screenshots at the end of every chapter. If you read my book, I will be really thankful to you for life. I am not so good at portraying things, yet you stick by my side and read this book regularly; you are owed a huge "THANK YOU" from me.
Week 5
I could not believe that I had made it this far. I still remember the nights when I used to look through projects on GitHub or GitLab, and today I complete my five weeks in GSoC. This week was fun and important in terms of work. We had no meeting at the start of the week, but my plan was already to work on scatter plots, and I tried making them as good and interactive as I could.
Starting off the week, I made a simple version of the scatter plot using just the raw data. Then I added a brush for selection, and slowly I progressed by adding fields to it, like selections for the X-axis and Y-axis, then the radius and colour of the dots. I also had a tooltip which, in upcoming weeks, will be useful for showing the spider plot for a specific data point. I spent the whole week adding these functionalities, so I don't really think there is a need to stretch the content much. But I would like to share some insights from my current learnings and life.
Since I had both a brush selection and a tooltip, I needed some idea of how to make both available. So I decided to create a selection button which, when turned ON, results in selection mode, and in the OFF state becomes tooltip mode. Creating the button was not the actual problem. But once I created it, there was a problem holding the state of the scatter plot when the selection changed: if I use selection mode and change the measure for the plot, and then switch to tooltip mode, the whole scatter plot gets updated when the state of the switch changes. This was the real problem I faced. In the end, I found the correct place to clear the plot area only when it is turned back on the second time. With this I also introduced another bug where two entries of the same scatter plot were being created, so I used a d3 selection to target the second entry and remove it from the DOM. I would also like to share that my brush feature relies on event selection, which has changed a lot from v3 to v7 of D3.js. So I went through the changes made in D3 v7 since v5 and found a way to make this compatible with D3 v7 and ES6 ECMAScript syntax. It was a good insight to know about the features which got deprecated due to bad architecture and the new ones which were included due to changes in ECMAScript.
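Below is a rough sketch of the selection-mode switch and the duplicate-plot cleanup described above, written against the D3 v7 event API (where the brush selection comes from the event argument rather than d3.event). The class names and handlers are illustrative only, not the actual project code.

```js
// Brush used while selection mode is ON
const brush = d3.brush().on("end", onBrushEnd);

function setSelectionMode(enabled) {
  // Remove any stale copy of the plot before redrawing, to avoid the
  // duplicate-chart bug mentioned above
  d3.selectAll(".scatter-plot").filter((d, i) => i > 0).remove();

  const svg = d3.select(".scatter-plot");
  if (enabled) {
    // Selection mode: attach the brush overlay
    svg.append("g").attr("class", "brush").call(brush);
  } else {
    // Tooltip mode: remove the brush so mouse events reach the dots again
    svg.select("g.brush").remove();
  }
}

function onBrushEnd(event) {
  // In D3 v7 the selection comes from the event argument, not d3.event
  if (event.selection) {
    const [[x0, y0], [x1, y1]] = event.selection;
    console.log("brushed region:", x0, y0, x1, y1);
  }
}
```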
I spent the other part of the week learning GraphQL, because in C4GT my mentor thinks they can easily reuse my project for generating the Hasura engine present in other GovTech projects. Until now, I was just working with REST APIs, so thanks to the new requirements I learned about GraphQL. This boosts my belief in accidental learning; I love learning like this. I also started learning about FreeRTOS (by Amazon) and embedded systems in general, because I want to know more about the RISC-V processor and work with it as well. On the other hand, this week groups were formed on Telegram and a Discord server for GSoC '22 participants, and knowing about other participants and their work is really insightful and helpful for learning. Then I mentored a hackathon, about 36 hours long, at a college of Gujarat University.
Amidst the project and professionalism, I personally felt not so good due to some issues. I guess it's completely okay not to panic in such situations, because you cannot expect everything to work the way you want. Even if you keep adjusting and things turn out to be more demanding, there are chances you might get frustrated at times. But you already have a beautiful day, gentleman, if you can at least sleep in peace. I can't, and I hate myself for this. It gets even worse when someone doesn't value it much. But anyway, all I can do is hope that things will be on track someday, maybe. If they aren't, then I already have a beautiful professional life for which I am grateful. Maybe it won't make any sense, but this GSoC book is actually something people would read only for the initial 3-4 chapters and then leave. I hardly believe anyone would ever bother to reach here. If you read the whole book, you are owed so much appreciation and thankfulness from me.
Let's close the week by adding the screenshots of the scatter plots made so far.
Week 6
This is the final week before the mid evaluation. It gives hope and confidence to every GSoCer that what they are doing is making an impact on lives far from where they are; you embrace the work you have been doing. This is really a good week to get yourself boosted for the next half. Starting off with my meet discussions, and then moving around with the joy of getting my first salary, surprising my parents, migrating back to campus through a rainy monsoon struggle, fighting the hostel authorities for a new room allotment, trying to normalize my new semester schedule and having tons of changes, the week was filled with so much excitement and randomness. Things work completely on randomness and every second brings something new. This is the spontaneity which I always adore.
On Monday, Guilherme and I had a meet where we discussed the project. I showed him the newly created scatter plots with all the interactivity I had added. They looked good, apart from the fact that the axes were slightly broken. In the meeting he told me about the Geant4 conference coming up in a few months; if they allow me, I may have a chance to visit it in France. This is one piece of good news I could get. Maybe they won't invite me, but at least I may get a chance to present remotely. Also, the time has now come to integrate my web reports into the g4run Jenkins, so developers can actually visualize the charts.
I also asked about the ATLAS experiment and whether we have the GDML for it. In that regard, we explored FullSimLight, which is to ATLAS what g4run is to CMS. I am still not sure about it; maybe I can explore it a little and get myself completely embedded in FullSimLight as well. I then asked about Webfest at CERN and other things. The final remarks from the meet were: everything up to the mid evaluation is done, just fix the small errors on the axes and create the skeleton for the spider plot.
So I spent a good amount of time figuring out how I could show another chart inside the tooltip. It got me into a good amount of brainstorming, and I ended up creating something which doesn't really exist on the web. This is one of those rare times when I create something like this; I feel good inside when something I make is unique and being done for the first time. But maybe it exists in some form which I am not aware of, so I cannot really claim it as mine. I am thinking of writing a tutorial just like Danny Yang's, from which I learned to make spider plots as standalone charts.
In between, we had a C4GT meetup where my mentor wanted to get into the Rust language. This was something I never expected in the coming 4-5 years. None of my fellows wanted to be part of the Rust community; they think C++ works really well, or maybe Python. But when I was banging my head against a 4-gig laptop while setting up LLVM, I learned why software engineering is not about writing code but about writing good code with lower-end users in mind. I bet anyone to work on LLVM with 4 gigs; if you manage it, I will never look at LLVM again in my entire life. Since I have faced the issue already, I am very much interested in not ruining the days and nights of some other kid willing to work on some good software. For me, learning Rust and applying it is a crucial part of ethics. Seeing someone willing to acknowledge that there is a good alternative to the languages we currently have is really nice. I have seen this for the very first time; it came out of nowhere, all of a sudden.
I got my first salary from C4GT. This made me very, very happy. I can now pay my semester fees from my work in GSoC and C4GT on my own, even after sparing the amount for the surprise gift which I gave to my parents. This really kept me going for the rest of the week.
But I had really messed up my programming last week in the scatter plots. I was using d3.extent() to get the maximum and minimum of the values for the plot axis ranges, but I had no idea that I was not actually converting the data from strings to numbers. So, let's say your data array is [2,30,99,1234,75432]; the extent was returning [2,99], because the values were compared as strings, and this really broke my entire plot. From this I also wondered whether I had messed up the same thing in the treemaps or reports. The fix was really simple, and I got it from the D3.js documentation: passing the d3.autoType function where we load the CSV, so it formats the data for us on its own. With just that, everything started working perfectly.
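For reference, here is what the d3.autoType fix looks like in a minimal form; the file name and the column used in the extent are assumptions for illustration, not the project's actual names.

```js
// d3.autoType as the row-conversion function coerces numeric strings to numbers,
// so d3.extent compares values numerically instead of lexicographically
d3.csv("report.csv", d3.autoType).then((data) => {
  // Without autoType, "99" > "1234" as strings; with it, 1234 > 99 as numbers
  const [min, max] = d3.extent(data, (d) => d.cycles); // assumed column name
  console.log(min, max);
});
```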
I went back to my campus due to the continuation of offline classes for the new semester. I think I have a bunch of interesting classes this semester; one of my favourite subjects seems to be Operating Systems. Let's see if I can perform really well in it or not. In the monsoon, it will be a gamble whether to take the raincoat or the laptop to the places you visit. Maybe this will affect my working ability, but managing it will be fun.
With the mid evaluation approaching this week, I have completed the entire schedule end to end, week by week. Not an inch of it has been manipulated, and this is what I have loved most about my GSoC journey. I was afraid when I started; I used to think, how can I get the charts really working? But my mentor left me with almost no stress. Whenever things broke he helped me, and for most of my project journey I relied on the documentation and similar material. It has really been a good experience working under CERN-HSF. I think the signal is green for my mid evals for sure. I had already made the PR for the HSF blog site around week 2 or week 3, which was the minimum criterion to clear before the evaluation round.
Let me share the screenshot of my newly created spider plot tooltip from this week.
Week 7
Welcome to the evaluation week, my friend! This week was a ride of thrills and overloads, along with some sudden moments of surprise.
Cutting the clutter, and starting with a slice of buttered bread with aloo matar (an Indian dish), let me get straight into the details. Due to my awkward class timings we skipped the meet. There was not much information to be processed in the meet anyway; all that needed focus was integrating the reports into Jenkins and creating the spider plots.
The integration had me baffled, and my mentor was on vacation, so we settled on doing the Jenkins integration at a later stage of the project, since all the deliverables were already working really well.
I was unable to define the coordinate geometry for the spider plot until my JavaScript and SVG superpowers got amplified by working on JavaScript projects all over the place. By that I mean I am not the Zeus of JS (read it as a rhyme), but my GSoC and C4GT projects are all JavaScript, which is actually one of those languages the communities rant about, of which I am an underground spectator. Alright, getting back to the point: I was at a good place to understand what was going on in the code. So I wrote some really nice JavaScript coordinate-geometry functions and took reference from Danny Yang's spider plot. I also made the brushing tool work in both dimensions, whereas until now it had been X-axis only. I actually wrote the working version of the spider plot in my college library, which is famously known as "The Central Library" of SVNIT, where lovers and coders and sleepy heads are commonly found.
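To show roughly what such coordinate-geometry functions do, here is a minimal sketch of mapping values onto the axes of a spider (radar) plot: each metric gets its own axis spread evenly around a circle, and a value is placed along that axis in proportion to its maximum. The metric names and numbers below are made up for illustration, not the project's actual code.

```js
// Made-up metrics and one made-up data row, purely for illustration
const metrics = ["cycles", "instructions", "branches", "branch-misses"];
const maxima = { "cycles": 100, "instructions": 100, "branches": 100, "branch-misses": 100 };
const row = { "cycles": 80, "instructions": 60, "branches": 40, "branch-misses": 10 };

// Map one value onto its axis: angle from the axis index, radius from the value
function spiderPoint(value, maxValue, axisIndex, axisCount, radius, cx, cy) {
  const angle = (2 * Math.PI * axisIndex) / axisCount - Math.PI / 2; // start at the top
  const r = radius * (value / maxValue); // distance from the centre
  return [cx + r * Math.cos(angle), cy + r * Math.sin(angle)];
}

// Joining the points of one row gives the polygon for an SVG <path>
const points = metrics.map((m, i) =>
  spiderPoint(row[m], maxima[m], i, metrics.length, 100, 150, 150)
);
const pathData = "M" + points.map((p) => p.join(",")).join("L") + "Z";
```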
I had classes starting straight from 8:30 AM up to 8 PM, with uncertain gaps in between. Frankly speaking, my GSoC project is the kind of good project that doesn't burn me out. On the contrary, my C4GT project is a burning one. Not to be rude about it: it burns me, but my learning curve in both is about equal. And what else would you want, if you learn and grow so much compared to peers who only fall into the vicious loops of someone sharing links with them? Self-exploration is the key to my work. If you think I am failing at my work, then it is a failure from your point of view. And if I am a learner to you, then self-exploration == learner quality.
I am simply passionate about working hard and doing things that don't even exist. I am still in search of another GSoC book (take into account the time of my first commit into this repository). So I made really good progress by visualizing what the world needs, and not just what I can serve to the world.
With a very busy schedule, I passed my GSoC mid-term evaluation and received my pay as well. My friends had full plans to extract a party out of me. :) Thanks to the Lapinoz Friday offer, they saved me from spending almost double the amount; they give Buy One Get One Free on Fridays, so I was saved from a financial crisis. It made everyone happy, my family and me most of all. I loved the week. I will also share the feedback from my mentor down below with the screenshots. I was very happy to hear the feedback; it feels good to work with them.
You might want to complain that the image quality is not so good, but instead I would suggest trying the "open in new tab" option. That will let you read it clearly.
Week 8
If you are a regular reader of my GSoC book (deep down I know no one reads my stuff regularly :) ), then firstly, I apologize for being late in writing my Week 8 journey. I was a bit more occupied this week. Without wasting any time, let me get straight to the meeting discussion.
I had a very chaotic schedule, and due to absent-mindedness I double-booked my GSoC and C4GT meetings at the same time. Anyhow, I convinced the C4GT mentor to postpone the meet until after my GSoC meet. Then, finally, I had my GSoC meet with Guilherme Amadio. We had a very normal talk about how the days were going for both of us. Shh, I have a secret here: he used to be a teaching assistant at the university where my cousins now study (UoI), so he has the perception that classes are really important and shouldn't be missed. But I missed 4 classes (well, not missed, got a proxy XD). Alright, coming to the main part of the discussion: he was not at his office with his system, so generating new data was not possible. After the scatter plot, I was thinking of building the sunburst, quickly completing the web part of my project and then jumping to the Jenkins integration. So for this week we settled only on the difference treemap reports. A treemap report by him already existed; I used it and integrated it into my version with a few modifications. Meanwhile, I also improved my old treemaps a little.
Yeah, so that's it for this week. I really had a tough schedule, going to classes and coming back, and working on the C4GT project and the GSoC project at the same time. In between, I joined SSOC (Social Summer of Code) as a mentor and was very actively working on some of the projects, doing PR reviews and things like that. I came back home for the festivals and got myself settled to work more effectively, so that too took a day. On the other hand, the C4GT people came up with issues which seemed over-ambitious at first sight, but when I got more and more involved, I realized they weren't that hard.
A short and sweet week, though. I had promised some UI changes and research into using JSON files for the sunburst, but I really missed that part. Just after publishing this Week 8 page, though, I am working on the UI changes. So, in the sense that I got my work done, I haven't missed any deadline.
Hellll Noooo!!! :( I just realized, while taking the screenshot, that I have completely messed up the difference dropdown selections. The diff treemaps are not rendering properly. I need to fix this real quick. Whew, let me fix the rendering, and I hope you enjoy your day!!
Here is the diff-treemaps screenshot.
Week 9
Alright! I have been on a long break from this GSoC book. But I have retrieved all the required information and the series of events in order of their occurrence, so I can remember this beautiful time of GSoC as I grow into an old man. So, let's start with the work. I had my difference treemaps in place and found a silly mistake in stratifying the data for the sunburst. So I created a sunburst out of the existing data, but I was still facing issues while labelling it.
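To give an idea of what stratifying the data for a sunburst involves, here is a hedged sketch using d3.stratify and d3.partition. The name/parent/value columns and the tiny sample hierarchy are assumptions for illustration, not the real schema of the perf reports.

```js
// Tiny made-up hierarchy: a root, two libraries, and their functions
const rows = [
  { name: "all", parent: "", value: 0 },
  { name: "libA", parent: "all", value: 0 },
  { name: "funcA1", parent: "libA", value: 30 },
  { name: "funcA2", parent: "libA", value: 20 },
  { name: "libB", parent: "all", value: 0 },
  { name: "funcB1", parent: "libB", value: 50 },
];

// Build the hierarchy from the flat rows, then accumulate values up the tree
const root = d3.stratify()
  .id((d) => d.name)
  .parentId((d) => d.parent)(rows)
  .sum((d) => +d.value || 0)
  .sort((a, b) => b.value - a.value);

// d3.partition assigns each node an angular slice (x0..x1) and a ring (y0..y1)
const radius = 200;
d3.partition().size([2 * Math.PI, radius])(root);

// Each descendant can then be drawn as an arc of the sunburst
const arc = d3.arc()
  .startAngle((d) => d.x0)
  .endAngle((d) => d.x1)
  .innerRadius((d) => d.y0)
  .outerRadius((d) => d.y1);
```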
Since Guilherme was about to get back to his office from vacation, and I was at home doing my chores, preparing the presentation for the C4GT program, giving tests for internships and waiting for replies from good companies in case they need me (but got none yet), we had our meet at the end of the week, on Friday, due to both of our busy schedules. So all I gave as an update was that I had created the difference treemap and the sunburst.
Here is the preview of the Sunburst.
Then in the meet he suggested that it would be good to separate the reports into two major parts: one specifically for the raw reports and another for the difference reports. So I worked on that task the next week.
Week 10
As written last week, I was expected to separate the difference report out into its own tab. Meanwhile, my mentor had reached his office and was busy with a presentation. Since we'd had our last meet at the very end of the week, we decided not to have one this week. Also, I was again on a speed run and completed my task of splitting the two reports apart by Monday itself.
Then Guilherme was working on finding a way to integrate the existing reports into Jenkins, and he told me not to worry much about the report integration; he would mostly take care of it, and once done, I could play with the CMake file and adjust the JS variables from it.
Another scope for improvement, after he forked my repo and looked at the site a little, was his suggestion that it would be more helpful for developers to be able to sort the existing tables by column values. So I worked on that part as well and implemented a sort feature triggered when you click on a table header cell. I was also expected to change the title of the site from "Performatic Visuals" to "Geant4 Performance Reports".
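Here is a rough sketch of the sort-on-click idea for the table headers; the selectors, the ascending/descending state handling and the assumed data binding are simplified placeholders, not the actual implementation.

```js
// Assumes the <th> cells were bound to the CSV column names via .data(data.columns)
// and that the <tbody> rows were built with a D3 data join over the row objects
let ascending = true;

d3.select("thead").selectAll("th").on("click", (event, column) => {
  ascending = !ascending;

  // selection.sort() reorders the bound <tr> elements in the DOM
  // according to the clicked column's numeric value
  d3.select("tbody").selectAll("tr").sort((a, b) =>
    ascending
      ? d3.ascending(+a[column], +b[column])
      : d3.descending(+a[column], +b[column])
  );
});
```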
I made all my changes and pushed them back to the repository. Now it was time for the Jenkins integration, for which I am waiting until I get further instructions.
Here are the separated pages and the reports they contain. I was back at campus again to get things back on track.
Raw Reports
Difference Reports
Week 11
We had scheduled a meeting at the very start of the week, on Tuesday, but the timing was bad. My mentor has kids, and the slot was 10:30 PM for me and 7:00 PM for him, right before he had to go pick up the kids. So we scheduled it a bit later.
In the meet, I was facing audio issues, and his kids were waiting to play with him, so both of us could barely discuss things from our ends, with bad audio quality and a little sweet disturbance from the kids. However, on Mattermost I confirmed all the discussion topics afterwards, in case I had missed or misinterpreted anything.
The week was officially kept for preparing the presentation for the end of the GSoC program. Since my classes were running fast enough to beat Usain Bolt and I was also involved in CERN's Webfest, I got almost no time to work on the presentation. I was even supposed to finish my blog for HSF this week. I am writing this page at 3 AM on Sunday; this is how badly I am doing with my time these days. Anyway, the presentation remains, and I will be able to write it tomorrow, that is, Sunday. But I also need to complete the HSF blog and the data integration for CERN's Webfest site, as I am the only one working on the technical side; the other teammates have just ghosted out of nowhere.
Don't worry, this week is not the end of my GSoC book; there will be a proper closure. But I made no GSoC progress this week to which I can attach any pictures. Maybe I can share glimpses of my CERN Webfest site instead.
I know :) it looks the same, but trust me, functionality-wise it will be a lot different from what I currently have. It will allow the user to upload data and render it, and other stuff if I can manage the time. Otherwise, sharing my piece of code is not that big a deal, I guess.