All right, so with all that out of the way I wanted to try to get a little more complicated. First up, wouldn't it be cool if we could add a little asterisk next to all of the seasons where Henry Riggs made the All-Star Game? Okay though, how do we do this? It turns out, there is a league_history_all_star table that looks like it is pretty much exactly what you'd expect it to be. This time I'm going to list the SQL first before the results since I think reading through the SQL will make them make more sense:
Code:
SELECT * FROM `modern-ish-baseball`.league_history_all_star asg
where asg.all_star = 7620;
7620, as you'll remember from before, is Henry Riggs' player_id. In this particular table, "all_star" is actually the player_id. Yell at Markus, not at me!
Code:
league sub year aspos all_star
100 1 1956 33 7620
100 1 1957 34 7620
100 1 1958 30 7620
100 1 1959 27 7620
100 1 1960 26 7620
100 1 1961 25 7620
100 1 1962 25 7620
100 1 1963 18 7620
100 1 1964 10 7620
100 1 1965 13 7620
100 1 1966 9 7620
100 1 1967 6 7620
100 1 1968 3 7620
These aren't the actual column names; I truncated a couple to make the layout make sense. Anyway, it looks like we do need to add a "league_id = 100" to our where clauses and I don't know what to do with that "all_star_position" field, but anyway, I do know that all we really need is the year.
Code:
SELECT
asg.year
FROM `modern-ish-baseball`.league_history_all_star asg
where
asg.all_star = 7620
and asg.league_id = 100;
Code:
year
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
As you can see, this is a pretty decent set. Riggs, who will certainly go into the Hall of Fame when his career is over, made the ASG every season from 1956 through 1968. He missed out this year because he was injured, though, and he also missed his first two seasons in the majors. So this works to test out because there will be several years where he'll need the asterisk but a couple where he won't. If he had nothing but ASGs in his career, you could accidentally write code that just always added an asterisk no matter what and... well, that would be pretty useless, wouldn't it?
In any case, this is the test subject I'm going to use. In real life programming you'd have several of these, you'd probably automate them (writing code that would evaluate the data you got back to confirm you're seeing what you're supposed to see), and then you'd have a QA department doing its own testing. But hey, for personal use, this should be fine...
All right, so this won't be the end result but just to test this out, let's tell our career batting report to only bring back years in which Riggs made the All Star Game... actually, no, first I'm going to get even simpler.
I feel like it's best when you're doing relatively complicated things like this to break things off into the smallest pieces you can, test the tiny increment you did, go ahead the next little increment test it to make sure that *that* works, and so on and so forth. This may sound kind of slow but trust me, this is way, way better than writing a whole bunch of code you think works only to find out that you made some very basic error in step 3 out of 50.
With SQL it's even worse because you can't really step through your code to figure this out, but I do the equivalent of this in "normal" code all the time. If anything it winds up speeding things up, in part because if you really get to thinking about breaking things down into the tiniest components you can, you free your brain to only having to have one small thing to think about instead of a much larger picture.
All of that is to say, let's bring back only... the year 1968. Let's just do one year for now; multiple years are a little tricky to hardcode, and we'll be able to bootstrap the final deal off of this anyway. I won't copy and paste the whole thing because y'all don't need to read all that but here's the relevant portion (there "where"):
Code:
where player_id = @player_id
and tm.league_id = 100
and split_id = 1
and bat.year in ( select 1968)
Code:
year team g ab r h 2B 3B hr rbi bb ibb k hpb sh sf sb cs gdp ba obp slg
1968 ATL 157 563 86 149 22 4 25 77 74 6 56 3 0 6 1 0 15 0.265 0.350 0.451
Totals 2075 7543 1305 2425 353 22 437 1270 1043 91 503 15 0 75 9 11 191 0.321 0.401 0.548
The important part there is that bit where I'm saying "bat.year in (more code)". That's basically saying, "if you can find the year in this set, include it in the report". In this case we only have one hardcoded year. It's not very useful! But... what if we took that code we were just working on - that block inside of the parentheses just wants a select statement - and asked the report to use that instead?
Code:
where player_id = @player_id
and tm.league_id = 100
and split_id = 1
and bat.year in ( SELECT
asg.year
FROM `modern-ish-baseball`.league_history_all_star asg
where
asg.all_star = @player_id
and asg.league_id = 100
)
Code:
year team g ab r h 2B 3B hr rbi bb ibb k hpb sh sf sb cs gdp ba obp slg
1956 ATL 125 454 70 142 21 4 20 64 58 2 32 0 0 4 1 0 13 0.313 0.388 0.509
1957 ATL 152 552 99 182 38 1 23 79 65 4 30 0 0 7 0 0 14 0.330 0.396 0.527
1958 ATL 151 570 98 192 29 2 27 94 77 3 42 0 0 5 0 0 10 0.337 0.413 0.537
1959 ATL 153 577 96 188 28 1 37 96 72 6 47 2 0 4 0 0 11 0.326 0.399 0.570
1960 ATL 140 499 79 173 25 2 24 72 74 12 20 0 0 3 1 1 9 0.347 0.429 0.549
1961 ATL 142 527 94 175 33 1 28 88 76 7 23 1 0 6 2 2 20 0.332 0.413 0.558
1962 ATL 148 540 103 188 26 3 33 92 76 7 21 1 0 4 1 0 12 0.348 0.427 0.591
1963 ATL 113 423 87 144 20 0 37 104 57 2 15 1 0 6 0 2 15 0.340 0.415 0.650
1964 ATL 85 314 63 108 17 1 16 41 39 2 20 1 0 1 0 1 12 0.344 0.417 0.557
1965 ATL 159 558 108 177 17 1 45 116 95 9 29 1 0 5 1 1 16 0.317 0.414 0.593
1966 ATL 157 573 111 191 21 0 44 119 88 9 38 0 0 4 1 4 14 0.333 0.420 0.600
1967 ATL 127 465 79 136 17 0 29 72 69 12 40 1 0 5 0 0 12 0.292 0.381 0.516
1968 ATL 157 563 86 149 22 4 25 77 74 6 56 3 0 6 1 0 15 0.265 0.350 0.451
Totals 2075 7543 1305 2425 353 22 437 1270 1043 91 503 15 0 75 9 11 191 0.321 0.401 0.548
I hope you can see what I mean by bite-sized chunks now! Just that where clause is now saying, "where the player is Henry Riggs, and where it's the Major Leagues, and for total splits only (no lefty/righty stuff... and... from years where another table says Henry Riggs was in the All Star Game". That's kind of complicated just to spit out, and the way you have to do it with SQL is with that nested SELECT statement, but as you can see, if you just do it one little step at a time, it's really not complicated at all.
Anyway though, we don't want to bring back *only* years where Riggs made the ASG, we want to add a star to the end of the year where that's the case. So, let's go ahead and yoink that nested SELECT from the WHERE clause and go up to at-bats. Adding an asterisk is pretty simple and we've basically done something similar with innings pitched:
Code:
concat(bat.year, '*') as year,
I'm not going to return show the entire data set; suffice it to say, every year has an asterisk at the end now. Also not useful but it's a start.
So... in order to make the asterisk appear on some items and not others, we've got to use an IF() statement. I looked this up on w3schools - I work with MS-SQL, not mySQL, and sometimes the syntax is a little different - and this is the format you have to use:
SELECT IF(500<1000, "YES", "NO");
So... anything inside of parentheses, we can nest. Again, do NOT attempt to wrap your brain around the entire thing! Anyway, let's add an if statement that's always true into that concat() statement above to make sure everything works right...
Code:
concat(bat.year,
if(true, "*", "")) as year,
Yep, as expected, it does the same thing as the last bit of code does.
NOW FOR THE BIG MONSTER REVEAL... let's take that statement we added to the where clause - all by itself, it essentially returns a "true" or "false" - and see what happens now!
Code:
concat(bat.year,
if(bat.year in (SELECT
asg.year
FROM `modern-ish-baseball`.league_history_all_star asg
where
asg.all_star = @player_id
and asg.league_id = 100
),
"*", "")
) as year,
tm.abbr as team,
LA REVEAL MAGNIFICO
Code:
year team g ab r h 2B 3B hr rbi bb ibb k hpb sh sf sb cs gdp ba obp slg
1954 ATL 70 243 23 66 10 1 4 29 21 2 28 1 0 4 1 0 6 0.272 0.327 0.370
1955 ATL 84 288 31 83 14 1 13 48 23 1 36 2 0 6 0 0 4 0.288 0.339 0.479
1956* ATL 125 454 70 142 21 4 20 64 58 2 32 0 0 4 1 0 13 0.313 0.388 0.509
1957* ATL 152 552 99 182 38 1 23 79 65 4 30 0 0 7 0 0 14 0.330 0.396 0.527
1958* ATL 151 570 98 192 29 2 27 94 77 3 42 0 0 5 0 0 10 0.337 0.413 0.537
1959* ATL 153 577 96 188 28 1 37 96 72 6 47 2 0 4 0 0 11 0.326 0.399 0.570
1960* ATL 140 499 79 173 25 2 24 72 74 12 20 0 0 3 1 1 9 0.347 0.429 0.549
1961* ATL 142 527 94 175 33 1 28 88 76 7 23 1 0 6 2 2 20 0.332 0.413 0.558
1962* ATL 148 540 103 188 26 3 33 92 76 7 21 1 0 4 1 0 12 0.348 0.427 0.591
1963* ATL 113 423 87 144 20 0 37 104 57 2 15 1 0 6 0 2 15 0.340 0.415 0.650
1964* ATL 85 314 63 108 17 1 16 41 39 2 20 1 0 1 0 1 12 0.344 0.417 0.557
1965* ATL 159 558 108 177 17 1 45 116 95 9 29 1 0 5 1 1 16 0.317 0.414 0.593
1966* ATL 157 573 111 191 21 0 44 119 88 9 38 0 0 4 1 4 14 0.333 0.420 0.600
1967* ATL 127 465 79 136 17 0 29 72 69 12 40 1 0 5 0 0 12 0.292 0.381 0.516
1968* ATL 157 563 86 149 22 4 25 77 74 6 56 3 0 6 1 0 15 0.265 0.350 0.451
1969 ATL 112 397 78 131 15 0 32 79 79 7 26 1 0 5 0 0 8 0.330 0.438 0.610
Totals 2075 7543 1305 2425 353 22 437 1270 1043 91 503 15 0 75 9 11 191 0.321 0.401 0.548
Okay, that's a lot of work for one measly character, I will admit. But imagine if you had to do 500 of those. I have to say that doing something like manually adding asterisks to players if they played in the All Star Game is a thing that's going to feel tedious to me by, like, the 2nd or 3rd player. And yes, I am 100% a person who will spend an hour designing a way to get out of 5 minutes of busy work. But seriously, even though this might seem like a lot, it's really, really not. Or maybe it kind of is but it's just a long list of easy things to do, done one at a time.
Next up... how about an awards column like they have on BBRef, where it might say "AS, MVP". UGH, this will already be a pain in the butt...