Hey everyone, I'm Aritra, a senior pursuing my B.E. degree in Electrical Engineering at Jadavpur University. This summer, I will work on adding more summarization functions to Mathesar UI as a part of GSoC 2023. I have shared my GSoC acceptance experience in my first blog (In case you have missed that, here is the link to the same). In this blog, I will share more about the project and my progress till now.
Here is my LinkedIn profile, GitHub profile and portfolio. There you will find numerous ways to get in touch with me. I will be writing more blogs on my GSoC experience, you can find them here.


What are summarization functions?

In the context of databases, summarization functions, also known as aggregate functions, are used to calculate and return a summary (sum, mean, median, array aggregation etc.) of some columns based on a Grouping column.
Let's say we have this table:

Players Level Matches Goals
C. Ronaldo International 198 122
L. Messi International 174 102
L. Messi Club 853 704
C. Ronaldo Club 968 715

Now we want to calculate the total number of matches and goals scored by each player. How do we do that?
If we know DBMS and SQL, it's fairly simple:


            SELECT
                Players,
                SUM(Matches) as Total_matches, 
                SUM(Goals) as Total_goals,
            FROM table
            GROUP BY
                Players
                    
Where SUM() is an aggregate function. Similarly, Pivot table in Excel gives us access to similar feature.

Here are different aggregate functions available in T-SQL, PostgreSQL and Excel.

Mathesar - Get more control over database

Mathesar is a project that aims to deliver database assets to non-technical personnel. A person with no previous knowledge of DBMS and Excel should be able to summarize data using intuitive and user-friendly Mathesar UI, which is our goal. Mathesar currently supports count and distinct_array_to_agg aggregates, and my mentors Brent and Sean (and me :3) will work on adding more summarization functions to Mathesar UI this summer. If you are interested in exploring the related feature requests and pull requests, here is the link to the meta issue.
Following is a video describing the newly implemented sum function. (full-screen mode recommended)



What I have learnt:

While working on this project, I have learnt sooo... many things (only two weeks have passed). Why should I expect less from such a collaborative community?
Though the project stack mentions front-end, backend and database, my project is more related to the backend (Django) and database (PostgreSQL and SQLAlchemy). During the implementation of aggregate functions, I had to study those functions as well as the current workflow of Mathesar APIs. Moreover, I am getting more familiar with TypeScript and Svelte.
Besides, Git is one of the most popular VCSs and is super-important when it comes to contributing to open-source. So, Sean and Brent explained to me some concepts of Git (like rebasing, stacking of PRs etc.) and showed me how they manage so many branches and PRs with Git. Those sessions helped me a lot.