And I would encourage you to read this (as it happens, there is some evidence - at least from my skim read)
And also:
psycnet.apa.org
These use data and statistical methodology as part of their validation process. I don't deny WG is flawed, and the article you posted shows some of the flaws, but most studies show the test has some validity and some level of predictive power in terms of outcome. In my personal experience, learning formal logic did improve my average score, though this is obviously unsubstantiated.
Of course, I'll be fair to SJTs - they are considered a data-backed predictor of job performance:
Now, my bigger issue with SJTs is this:
1) You end up ranking responses to a situation that you wouldn't actually give, so it becomes a question of which you would least do. At this point, you are just picking what the firm wants based on the firm's profile. In a sense that is a thinking skill, but it is not reflective of personality (at least in the way the test is designed).
2) On sliding scales, the two ends are never polar opposites. For example, you could be asked to place yourself between "I prefer to plan my work where possible" and "I have the ability to respond to change and work under pressure."
Now, how would somebody who has great ability to respond to change, but personality-wise prefers to plan where possible, mark themselves? Do they mark themselves in the middle, which could indicate the same as somebody who is not strong at either, when they are in fact strong at both?
3) The same person answering will get vastly different strengths and weaknesses every time. Personally, I disagree with its validity as a predictor of specific personality traits. The overall scores, which are what firms use, I guess I can agree carry validity.
Anyways, I believe both studies (or at least the first) show that the inductive and deductive reasoning aspects carry the most weight. And not to sound like a broken record, but Osborne Clarke had a better, more evolved way to measure deductive reasoning than WG (and it is the type of deductive reasoning I'd say is more reliable).
The bad and good news is that what either of us thinks doesn't matter, as we don't design the tests. However, what I can say, if it's helpful to anybody, is that the critical thinking book (and occasional podcast) did improve my WG scores. You can disagree with the correlation, but it's simply what I experienced firsthand.