SQL group by and get top record

(Read original article here)

Forget SQL for a second… Ever had this problem? Given 10 groups of girls (grouped by some style discriminator such as blonde, brunnete, asian, latina, brasilian1, brasilian2, brasilian3, etc), you want to pick the hottest one out of each group. I sure as hell did… Hmmm good times those were… … …AAAAAnyways, back to SQL…

Given a table with a non-unique discriminator column (style), sortable column (hotness) and other data (name), we want to select only the hottest and their names of each style. In other words group by this style column, but for other columns, hotness and name, show us only the hotness and name of the hottest girl. Here is an example. Given this [Girls] table

style        hotness         name
------------ --------------- -----------------------
blonde       5               Jenna
blonde       9               Melissa
brunette     3               Anna
brunette     10              Hanna
blonde       8               Jessica
redhead      10              Laura

I want an SQL query that returns

style        hotness         name
------------ --------------- -----------------------
blonde       9               Melissa
brunette     10              Hanna
redhead      10              Laura

AKA select only the hottest girls for each style, based on the hotness sorting criteria, not some silly group-by aggregate functions (LOL in this example)… No, I want my MY OWN custom sort function, give me the hottest! This post tells you one way of doing that:

FROM [dbo].[Girls] g1
LEFT JOIN [dbo].[Girls] g2 ON (
    g1.style = g2.style AND
    g1.[hotness] < g2.[hotness])
WHERE g2.[hotness] IS NULL

Notice how the query doesn’t even have a “group by”. Instead this guy uses a left join of the [Girls] table on itself on equal style and the joiner (g2) must have the hotness value larger than joinee (g1). This join will find a g2 for every g1 (within the same style) except for when g1 has THE HIGHEST hotness of them all, in which case, since this is a left join, it will join with NULLs. The subsequent check, g2.[hotness] IS NULL, is therefore used to pickout out only those rows that will contain g1 with the highest value of hotness. This is exactly what we need!

Bonus: filter by given styles, editing multiple lines

Forget girls now. Your table is now called [GarbageLogs]. Instead of “style” you have a column “identifier” for your discriminator. You are sorting by date now and getting the latest entry for each identifier. Your boss comes to you and gives you a list of 200 identifiers she wants you to filter by. AKA, a text file that looks like the following:


You need to add a WHERE style IN (…) clause to your SQL. To that end you need to put a comma at the end of each line, and, in case of strings, surround each one with single quotes. There are tools out there that will help you with your 200 strings:

  1. Notepad++ users: this post tells you how to edit multiple lines at the same time. Start at the top, hold alt+Shift and hit the DOWN key. One limitation, however: it will edit the same column for every line, so to put a comma at the end of every line, the lines need to be of equal length. Solution: start using Sublime text 2 you noobs!
  2. Sublime text 2 users: (GOOD FOR YOU FELLAS!) this post tells you how to edit multiple lines at the same time. Select all, hit ctrl+Shift+L. The cursor will become funky. Make sure you hit the HOME or END key, ’cause everything is still selected. This editor is wonderful…

Now that the strings are nicely formatted you can paste them into your SQL query to filter the records:

FROM [dbo].[GarbageLogs] g1
LEFT JOIN [dbo].[GarbageLogs] g2 ON (
    g1.identifier = g2.identifier AND
    g1.[date] < g2.[date])
WHERE g2.[date] IS NULL
AND h1.identifier IN (


Leave a Reply

Your email address will not be published.