Wednesday, October 24, 2007

Google: Statistical machine translation

ArsTechnica has a mini review of Google's translation service. Google has switched from the rule-based machine translation it used to have, which most machine translation services still use, to a statistical approach.

I remember studying linguistic rules and statistical machine translation methods in college. As the article suggests, neither one is great, but they can work well enough for someone to feel their way to the actual translation. The linguistic rules approach parses text into an intermediary state using formal grammar rules of the source language. The intermediary text is then transformed into the target language using formal grammar rules of the target language.

The main problem with the linguistic rules approach is that it amounts to taking a sentence, marking it up as a parse tree, rearranging it into the parse tree of the target language, and then changing it word for word. Another major problem is that the grammars do not do well with slang, since there may not be a direct translation. A third problem is one of syntax: there may be structures missing from the source that are needed in the target. For example, to properly translate "I went to the store" from English to Russian one needs to know whether I traveled on foot or in a vehicle, since that changes the verb.
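To make the Russian example concrete, here is a minimal sketch in Python. The lookup table and the `translate_went` function are hypothetical illustrations, not part of any real translation system, and the verb forms assume a male speaker describing a single completed trip. The point is only that the English source does not carry the information the rule needs.

```python
from typing import Optional

# Toy lookup: (English verb, mode of travel) -> Russian past-tense verb.
# Assumption: male speaker, one completed trip. Real Russian has more cases.
WENT_FORMS = {
    ("went", "on_foot"): "пошёл",
    ("went", "by_vehicle"): "поехал",
}


def translate_went(mode_of_travel: Optional[str]) -> str:
    """Translate "I went to the store" given how the speaker traveled."""
    form = WENT_FORMS.get(("went", mode_of_travel))
    if form is None:
        # The English sentence alone never says how the speaker traveled,
        # so a purely rule-based system has nothing to base the choice on.
        raise ValueError("cannot choose a Russian verb without the mode of travel")
    return f"Я {form} в магазин"
```

Calling `translate_went("on_foot")` and `translate_went("by_vehicle")` gives two different sentences, while `translate_went(None)` fails, which is roughly the position a rule-based system is in when it only has the English text.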

The statistical approach basically uses an algorithm to weigh the probability of the part of speech and/or meaning of a word. The statistics can be modified with the help of volunteers marking up a sentence or providing a more accurate translation. Given enough corrections and a large enough corpus, the system can improve. Google appears to be using its index of web pages as a potential corpus and users of the service as the volunteers, instead of the usual college student looking for beer money.
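As a rough illustration of the idea, and definitely not Google's actual system, the following Python sketch estimates translation probabilities by counting how often a source word was seen paired with each target word. The class name `ToyTranslationModel`, its methods, and the "bank" example are all made up for this post; volunteer corrections are modeled simply as extra observations.

```python
from collections import Counter, defaultdict
from typing import Optional


class ToyTranslationModel:
    """Relative-frequency estimate of P(target word | source word)."""

    def __init__(self) -> None:
        # counts[source][target] = how many times the pairing was observed
        self.counts = defaultdict(Counter)

    def observe(self, source: str, target: str, weight: int = 1) -> None:
        """Record a pairing from the corpus or from a volunteer correction."""
        self.counts[source][target] += weight

    def probability(self, source: str, target: str) -> float:
        seen = self.counts.get(source)
        if not seen:
            return 0.0
        return seen[target] / sum(seen.values())

    def best(self, source: str) -> Optional[str]:
        """Return the most frequently observed translation, if any."""
        seen = self.counts.get(source)
        if not seen:
            return None
        return seen.most_common(1)[0][0]


model = ToyTranslationModel()
# Hypothetical corpus evidence: English "bank" mostly paired with "banque".
model.observe("bank", "banque", weight=3)
model.observe("bank", "rive", weight=1)
```

With only the corpus counts, `model.best("bank")` is "banque". If volunteers then supply enough corrections toward "rive" (for text about rivers, say), the counts shift and so does the preferred translation, which is the sense in which the system improves with a bigger corpus and more corrections.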

Google's approach reminds me of a few journal articles on using web pages as an inexpensive means to develop a corpus. Most corpora are rather expensive proprietary collections of text showing language in everyday use. The statistical approach seems like a no-brainer for Google, since it has a corpus lying around and, by harnessing users, even a poor algorithm is bound to get better. The linguistic rules approach only gets better with the development of more elaborate syntactical and transformative rules. The only question is what took Google so long to figure this out?

Saturday, September 01, 2007

Justice for the Jena 6

This is an important case and a sad commentary on race relations in America. Here is the link to the Justice in Jena site so that you can keep up with events and get involved. For those who may not have heard about the case, NPR has a good rundown of the facts, but I'll try to present them here.

In the small central Louisiana town of Jena there was a large shade tree outside of the high school. White students would sit underneath it while Black students stayed close to the cafeteria. At an assembly a Black student asked if he could sit under the shade tree and was told he could sit wherever he liked. Three White students who were part of the rodeo team then hung nooses from the tree. The school gave the boys in-school suspension, but the Black students thought that the punishment was too lenient.

The Black students, led by star players on the football team, organized a sit-in under the shade tree. The authorities were called and the district attorney told the children, "with one stroke of my pen, I can make your life disappear."

There were fights throughout the year, which escalated into the school being burned down, though who was responsible was not determined. Robert Bailey (16), accompanied by other Black students, tried to enter a party that was attended by Whites. He was beaten up by some of the White boys and no charges were filed against them. During the fracas he was hit over the head with a beer bottle by Justin Sloan, who months later was charged with simple assault and given probation.

At a convenience store the next day, Bailey argued with one of the White boys from the party, who ran to his truck and retrieved his pistol-grip shotgun. Bailey ran at the armed teenager and wrestled for the gun, eventually getting it away from the boy and heading home with friends. Bailey and his friends were charged with theft of a firearm, robbery, and disturbing the peace. The White boy who pulled the gun wasn't charged with anything.

Justin Barker (17) was bragging to friends that Bailey had been whipped by a White man. He was attacked by Black students when he went into the courtyard. The first punch knocked him out and some of the boys kicked him in the head. The wounds were slight enough that he was treated, released and out that very night at a social function.

Six Black students were charged with assault, but the D.A., Reed Walters, bumped the charges up to attempted second-degree murder. The first trial is over, with the defense resting its case immediately after two days of the prosecution presenting its case. Mychal Bell was found guilty by the all-white jury and faces a possible 22 years in prison.

For anyone who has a hard time understanding, let's make it simpler. A Black kid asks for permission to sit under a tree on the campus of the public high school that he attends, and nooses are hung from it. The kids who did it get a slap on the wrist. Some of the Black students decide to protest by sitting under the tree and they are threatened by the district attorney. One of the Black students and his friends try to get into a party and he is beaten up. He argues with one of the kids from the party the next day and has a shotgun pulled on him. He wrestles the gun away and is then charged with theft and related charges for getting the gun away from the guy (the gun turned out to be unloaded, but there was no way for them to know that while the gun was pointed at them). A White kid boasts about the "gun thief" getting charged and is then beaten up, which wasn't right, but charging the kids with attempted murder is idiotic and spiteful. At most they should have been charged with a mutual fight or assault and given a fine or probation. The boy didn't have any life-threatening injuries and was able to amble on down to a ring ceremony after being so "viciously" attacked.

I could do the whole metaphorical thing with the tree of intolerance and the shaded truth. But a case like this is just depressing and a stark reminder of how short a distance we've come as a nation in 40 years. I guess the defendants should take consolation that 40 years ago they would have been swinging beneath that shade tree instead of being lynched by the legal system and the tree turned into kindling.

Monday, August 27, 2007

Building Blocs of the Gay Community

There was an article a while back about the study the Equality Forum did of GLBT voting patterns in the Philadelphia mayoral primary. The interesting thing about the report is that gays and lesbians seemed to vote as a bloc for the eventual winner, Michael Nutter. For those outside of Philly, or who live here but don’t follow the local political scene, Nutter climbed from next-to-last to first in a matter of months to win the nomination.

The methodology of the study was pretty interesting in that it necessarily relied on a number of key assumptions. It looked at the Census information for areas with large concentrations of self-identified same-sex couples. Then, looking at the poll results, the researchers were able to show that in areas with large same-sex couple populations Nutter received a plurality of the vote.
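The kind of calculation described above can be sketched in a few lines of Python. This is only my guess at the general shape of the analysis, not the Equality Forum's actual method; the function name `plurality_winner`, the field names, the threshold, and every number below are invented for illustration, with placeholder candidates A, B and C.

```python
from collections import Counter


def plurality_winner(areas, min_share):
    """Pool votes from areas whose same-sex couple share meets min_share.

    Returns (candidate, share_of_pooled_vote), or None if no area qualifies.
    """
    totals = Counter()
    for area in areas:
        if area["couple_share"] >= min_share:
            totals.update(area["votes"])  # adds each candidate's votes
    if not totals:
        return None
    winner, votes = totals.most_common(1)[0]
    return winner, votes / sum(totals.values())


# Made-up data: two high-concentration areas and one low-concentration area.
sample_areas = [
    {"couple_share": 0.04, "votes": {"A": 400, "B": 300, "C": 300}},
    {"couple_share": 0.05, "votes": {"A": 350, "B": 400, "C": 250}},
    {"couple_share": 0.005, "votes": {"A": 100, "B": 900, "C": 0}},
]
```

With a threshold of 0.03, candidate A leads the pooled vote with 37.5 percent, a plurality rather than a majority. Note the key assumption baked in: the approach treats how an area voted as a stand-in for how the same-sex couples in it voted, which is exactly why the interpretation has to be taken with some care.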

This is interesting for a number of reasons, if one can take the interpretation of the statistics seriously. First, it shows that the GLBT community can vote as a bloc to express political will. Used properly, bloc voting can be a carrot or a stick to make sure more than lip service is paid to an issue. Used poorly, you wind up like the Black community nationally: ignored by the Democrats until election time, trotted out by some Republicans as the boogeyman during elections, and otherwise mostly ignored since there is little upside in trying to capture your vote.

For an example of how the bloc vote can go bad, consider Lee Atwater, Ronald Reagan’s political advisor, who made a really good point when describing the Southern Strategy, as reported by Bob Herbert in the New York Times. (Copied and pasted from Wikipedia.)

Atwater: As to the whole Southern strategy that Harry Dent and others put together in 1968, opposition to the Voting Rights Act would have been a central part of keeping the South. Now [the new Southern Strategy of Ronald Reagan] doesn’t have to do that. All you have to do to keep the South is for Reagan to run in place on the issues he’s campaigned on since 1964… and that’s fiscal conservatism, balancing the budget, cut taxes, you know, the whole cluster…

Questioner: But the fact is, isn’t it, that Reagan does get to the Wallace voter and to the racist side of the Wallace voter by doing away with legal services, by cutting down on food stamps…?

Atwater: You start out in 1954 by saying, 'Nigger, nigger, nigger.' By 1968 you can't say 'nigger' - that hurts you. Backfires. So you say stuff like forced busing, states' rights and all that stuff. You're getting so abstract now that you're talking about cutting taxes, and all these things you're talking about are totally economic things and a byproduct of them is that blacks get hurt worse than whites.
And subconsciously maybe that is part of it. I'm not saying that. But I'm saying that if it is getting that abstract, and that coded, that we are doing away with the racial problem one way or the other. You follow me - because obviously sitting around saying, 'We want to cut this,' is much more abstract than even the busing thing, and a hell of a lot more abstract than 'Nigger, nigger.'

The strategy is alive and well today; one need only look at the “McCain’s Black Baby” phone calls in South Carolina during the Republican primary for the 2000 campaign. It was sleazy but effective. With an anonymous call, votes for all candidates would be suppressed by those offended that such an accusation would be made, but McCain lost the most because enough people would be disturbed by the alleged extramarital affair and/or the race of the woman involved. As a side note, a similar issue was brought up about Nutter and Brady not being Catholic or Catholic enough. Since the flyer endorsed Knox as being the one true Catholic, anyone upset by the sleaziness likely took it out on him.

As the Atwater quote states, candidates can’t be as blunt as they once were in trying to court a particular group’s vote, at least when the way to do it is on the backs of another group. The McCain example notwithstanding, subtlety is crucial. When politicians want to make political hay out of attacking the GLBT community, they rarely come right out and say the Jerry Falwell line about 9/11 happening because of the gays, lesbians, feminists, and abortionists. Invective like that will backfire; instead you say you’re against special rights and only for states’ rights. When someone brings up Loving v. Virginia and the possible precedent it sets for gay marriage under the 14th Amendment, you say you’re against judges legislating from the bench and that such decisions should be left to the legislature. As long as the voting bloc isn’t sufficiently large and the general public will accept such answers, you’ll get re-elected. The vocal voting bloc is itself a get-out-the-vote tool for its opposition, which can be placated into a stable base.

A side note is that, with the higher than average rates of Black and Latino voters attending mass regularly, “family values” can be an effective wedge strategy. One need look no further than the 2004 elections, when opposition to gay marriage brought out the evangelical vote in large numbers. There are some obvious problems with this strategy, though. There are only so many anti-gay laws that can be passed before you start to look a little mean-spirited. The mean-spirited bar is a moving target, since the more out GLBT people heterosexuals know, the less negatively they tend to view homosexuality on average. The big problem is the statistic showing that negative feelings toward homosexuality are less common among those under 40. Among those who will vote in upcoming elections, speaking negatively may backfire.

The election results are interesting because the five candidates were decent to good on the GLBT issues. One candidate, Dwight Evans, tried hard early on to cultivate the GLBT vote for his campaign but was dead last at the polls. The results may have a lot to do with the tangible legislation Nutter passed awarding domestic partner benefits to city employees.

The key to a successful bloc is to vote the issues and hold the candidates responsible for their votes. That’s how the NRA became such a force in politics. It’s a shame that the Logo “debates” were business as usual. We’ll get nowhere nationally kissing the asses of people who give us a kick in the shins in public. The Democratic candidates are better than the Republicans but not by much really.

Domestic partnership is a joke and, like “Don’t Ask Don’t Tell,” would only serve to keep us in limbo. Domestic partnership is a denial of equal rights. The argument that all but two of the candidates have made, that gay marriage shouldn’t be recognized because it would violate the religious rights of churches, is a way to sidestep the issue.
It's not like people want babies to get married. Under the same argument Rudy Giuliani isn’t married, since he divorced his previous wife and under Catholic tradition one can’t get divorced. Therefore the state should not allow Giuliani to enjoy any of the rights afforded married couples, since his living arrangement would scandalize some religious institutions.

The point is that the Catholic Church has every right to deny Giuliani Communion and refuse to officiate his marriage, since it violates their religious beliefs, but the state could not deny his rights because we don’t live in a theocracy. The same principles could be used for gay marriage. Only Kucinich and Gravel approach this view, and Logo should have called the front-runners in the Democratic Party on it. Sure, the Democratic Party is the lesser of two evils in this case, but as Eugene Debs once said, “The lesser of two evils is still evil.” Politicians will give the least that they can to get your vote; don’t give it away for free.

Sunday, August 12, 2007

It was a "Big Black Man"

I don't know why people even use variations of this excuse anymore. It makes them look not only like liars, but like racist jerks who think that the general public is even dumber than they are. The excuse goes something like this: "I'm not to blame; it was some Big Black Man who caused me to (insert unlikely sequence of events)." It has to be said that it has a history of working.

As is mentioned in the article, Charles Stuart made good use of the "Big Black Man" excuse when he killed his wife. The Boston police were quick to pursue the preposterous story and charge Willie Bennett. After Susan Smith drowned her kids in a lake she was able to convince the country to be on the lookout for a "Big Black Man" who kidnapped her kids and stole her car. Of course, Hispanic men appear to be the new dark-skinned bogeyman, as evidenced by the runaway bride.

Florida State Rep. Bob Allen would have done better to just plead out the solicitation charge. His story that he offered 20 bucks and oral sex to an undercover officer because he was scared of the "Big Black Man" smacks of the Mandingo myth: the untamed "Big Black Man" understands nothing but violence and sex. The myth has been fostered in popular culture by pornography, obviously, as well as music and television. Poulson-Bryant makes a decent point that the myth of the ubiquity of large penises among Black men leads to an idea of hyper-masculinity among Black men in the broader society. This idea of hyper-masculinity gives plausibility to the "Big Black Man" in the mind of the broader society. So much testosterone is flowing through their brains that they can't think clearly and only understand violence and sex.

Allen's excuse is rather odd. While hyper-masculinity has homo-erotic elements (just look at the cover of a hip-hop magazine), offering a robber some of the money from your wallet and a sex act would seem to increase the chances of getting hurt, not lessen them. While homo-eroticism is part of hyper-masculinity in the Mandingo myth, it is merely there as an underlying element, more a response of the observing male to the Mandingo, who is sex personified. The hyper-masculine is hyper-heterosexual, at least outwardly.

Allen's excuse also rests on the unstated racist assumption that the Blackness of a man further up the walk is equivalent to a gun or knife, a threat in itself that demands mitigation. According to Allen's account, he made the offer of money and sex, without provocation, to a man who turned out to be a police officer. His excuse also puts a unique twist on the gay panic defense. The gay panic defense had mixed success in the Matthew Shepard case.

Usually with gay panic the defendant says they went temporarily insane and had to kill the victim for coming on to them. It is a form of jury nullification basically saying that the defendant was within their right to prove their masculinity by killing the victim. In this case Allen turns gay panic on its head, "I had to offer him oral sex or he might have raped and killed me." Not to put words in Allen's mouth (poor choice of words) but this appears to be where his defense is heading.

If Allen is lucky he can find a jury racist and homophobic enough to reduce his sentence or perhaps let him off. I say homophobic enough since they would have to buy into the stereotype that gay men will have sex with anyone and, if they possess enough power, will rape smaller, more fragile men: the prison myth. There obviously is some rape in prison, and there are obviously some Black men with above-average penis size. The two generalizations are combined in the "Big Black Man" myth to form a hyper-sexed, hyper-violent thug who has a huge penis that he'd love to stick in the White man. It's miscegenation and homosexuality wrapped in a bow, looking for the right jury, one that is bent on taking a stand for "traditional values" against a railroading Democratic government. Allen is taking a big risk because he is going to alienate not only Black and gay voters, who probably wouldn't have supported him in huge numbers anyway, but also some of the conservative backbone of his constituency, who may just stay home.

If the Republican party is lucky, Allen will plead out before the next election gets too close. One congressman's hypocrisy has a quick way of branding the whole party; see how fast the Democrats dropped support of Jefferson. The Republicans seem to be playing it right so far: let him hang himself. If McCain is lucky, no one will remember that this guy was his man in Florida. It doesn't help to appear to have a record of appointing felons when you're making a run for the White House. Once you get in you can spin this type of situation like a top. Every president in the modern era has had some questionable people in their cabinet, but you don't want to start off with the plausible deniability game before the inauguration. To be honest, Lyndon LaRouche has a better chance than McCain of winning the White House.

The Democratic party will probably just sit back and bide their time. This isn't a huge national issue, but it can provide a regional opening if they let Allen commit political suicide. They don't have to worry about the body; they'll just blame the "Big Black Man."

Saturday, June 09, 2007

Pop-up teacher reprise

The teacher who was facing 40 years for pornographic pop-ups in the classroom is getting a new trial. This is good news in my opinion, since she was railroaded. It's understandable: as soon as you mention sex and kids, people's minds shut down. It's not right, but it's understandable.

Another teacher had a similar incident this week. Fortunately she wasn't hauled into court. As is mentioned in the article the likely cause was the duplication center dubbing the educational material onto a porno tape.

Saturday, March 03, 2007

Top of the pops

Following a post on Angry Asian Man, I saw The New York Times has an interesting article on the challenges Asian American singers have in trying to break into the music industry. This could be seen on American Idol this year when Paul Kim was given the boot in the first round of voting. Despite being one of the best singers he couldn't overcome the fact that he was Asian.

The article goes on to talk about how racially ambiguous singers fare better. As someone who grew up during the '80s, I could only think of how African Americans crossed over onto the mainstream charts by creating a parallel market where acts such as Diana Ross and Michael Jackson could eventually break through. Performers such as Prince could then start their careers with the mainstream in mind. Of course, as the article states, the Asian American market is about half the size of the African American market. The Asian American market also has the problem of not being uniform, because there are many subcultures and backgrounds, Japanese Americans or Korean Americans for example.

There is still hope, since there is so much talent waiting in the wings; there just needs to be that one big hit in the mainstream. As soon as the mainstream audience is used to seeing an Asian singer in their living room it gets easier for the next act, but never easy. Nat King Cole having his own show paved the way for the rappers and singers one sees today. The irony of it all is that, because of the stereotypes that many have of Asians and the market realities, the first big Asian star is probably not going to be Asian American.

The recent interest in J-Pop crossing into the mainstream may eventually translate into more tours and CD sales (or downloads, if the regionalization of iTunes and other stores ever gets looser). Because of the stereotype of a fifth-generation Japanese American as being straight off the plane from Japan, this interest could be used to promote domestic singers and bands. It's a bit of a long shot, but waiting for the industry to judge people on talent and not the color of their skin or the shape of their eyes is going to take longer and a lot more luck. Just look at Living Colour and Bad Brains, two of the best heavy metal/fusion and punk bands respectively, who have never really gotten as far as white bands with less talent. The British invasion is not a direct parallel, but it is a study in the sound and look of an American subculture being ignored until it is repackaged and shipped back from another country. The mainstream market likes to fit people into niches: Asians do classical and Blacks do hip-hop, and never the twain shall meet.

Monday, February 19, 2007

An epistle on epithets part 1

I’ve been reading Covering by Kenji Yoshino – well, I actually bought it a while ago but got sidetracked – and I’ve been thinking about the problems some celebrities have had with epithets. Mel Gibson, Michael Richards, Isaiah Washington and most recently Tim Hardaway have run into some difficulties for using slurs. The religious, racial and sexual derogatory terms that were used by Gibson, Richards, and Washington / Hardaway respectively have force mainly in how they differentiate the target from the “norm.”

This categorization and classification as being different grants the more “normal” or “ideal” among us power – to greatly summarize Foucault – in the form of the gaze. The epithet is in a way the verbal expression of the gaze; it allows one to point to those who have not successfully assimilated themselves as being freaks outside of normal human discourse. It is a means to objectify the targets of the gaze and the epithets, subjugating their humanity and reducing them merely to the epithetic difference.

I’ve been the target of all three of the types of epithets that the above-mentioned celebrities used, as have several others. The most recent controversy over “Grey’s Anatomy” star Isaiah Washington’s use of the “F word” struck a nerve because so many people trotted out the same old tropes. First, some people I know who shall remain nameless – who know that I am gay, mind you – said that it wasn’t a big deal because he was using the word to deny using the word. T. R. Knight, the person he was ostensibly referring to with his comments, stated that Washington said them in October during the big kerfuffle. The brouhaha forced Knight out of the closet. This is the “you people have always been so thin-skinned” trope.

Another trope is the old “some of my best friends are (insert oppressed class I just insulted),” which Washington brought out when he brought up his role in Spike Lee’s “Get on the Bus” as a Gay Black Republican. While it is true he was a poster boy for Mary Cheney, PFLAG and others, it doesn’t give him a free pass on the use of epithets. If it did, Richards could have just pointed to Kramer having an African American attorney after his outburst.

Then there is the old “ruler contest” that is trotted out every time one person from a minority insults another minority. I saw this when some people jumped to Washington’s defense, saying that if he is fired it is a sure sign of racism on the part of the producers. The reasoning works like this: Blacks have suffered through slavery and segregation and are still given less pay and fewer opportunities in professions like acting, so the “F word” is bad but not as bad as the “N word,” and Knight and everyone else should get over it.

The last trope that I’ll bring up is the “but you say it” argument. While I’m a Black Gay Man, I try to avoid using the N-word or the F-word, because they have a dark history attached to them. Other people believe that they should be reclaimed, but they tend to forget that reclaiming in the modern age means commercialization. When you commercialize a word it goes beyond the confines of the group. Using the words in pop culture implicitly gives permission to people not of the affected communities to use them. Hence, “why can 50 Cent say it and I can’t?”

While it is true that the overwhelming audience of hip-hop is suburban Caucasians, one has to wonder why so many of them have the urge to use the N-word to show that they “are down with their boys.” I’ve never felt the urge to sling a few anti-Semitic words at my Jewish friends to show my affection.

The main problem is that on the one hand people argue that these are just words and have no real power, and on the other they show the power of the words by pleading to be able to drop them casually in polite conversation. If the words are not meant in a harmful manner, then why insist on using them when others say that they are harmful to them?

For those who still think that people are just blowing things out of proportion and wish to affect a more laissez-faire attitude in their speech, there is a simple experiment to try. The experiment goes like this: replace your speech with the entire hip-hop lexicon, not just spinners, n****s and f*****s but b*****s and h*s as well. Do this regardless of the audience or to whom the term applies. For example, a man should introduce his girlfriend with “this is my b***h, she’s chill wit’ whatevah.” No one who had any respect for his girlfriend would say something like this. So why would it be acceptable to say we can dance like some n****s (a la Paris Hilton)?

Sunday, February 18, 2007

Pop-ups and jail time

A substitute teacher is facing 40 years for porno pop-ups during class in CT. This is despite the PCs running an unpatched version of Windows 98 without antivirus, pop-up blockers, or spyware blockers. The teacher is being thrown under the bus by the school district on this one because parents want their pound of flesh. The only alternative is to point the finger at the IT staff of the district for not doing a better job of network security. To be fair, locking up the teacher or the staff in this case seems to be going a little too far. Exposing the children to pornography wasn't the intent of anyone involved in this case.

One thing that should be considered in relation to school libraries and even public libraries is the balance between privacy and security. Large institutions tend to combine a desire to maintain a homogeneous environment with a lack of disposable funds. Because of this, the lowest common denominator in security is usually what is maintained. At the very least, a browser with a built-in pop-up blocker should be the default, and antivirus and anti-spyware software has to be on every computer no matter what the operating system.

I'm hoping that she will get an appeal with a judge who actually understands that pop-ups are not sought out. I'm not holding my breath.

Wednesday, January 03, 2007

The Becky

The Language Log Blog just announced the winner of the Goropius Becanus Prize. More accurately, Geoff Nunberg announced it on Fresh Air. The award goes to a person or organization that has made "outstanding contributions to linguistic misinformation."

I'd have to agree that this is a very good choice. The assertions that I heard in her radio interview on Radio Times sounded like someone trying to cash in on stereotypes in disregard of actual scientific studies. For the sake of convenience, here are the search results for articles talking about the "scientist" on the LLB.

The fact that she got so much attention and that actual linguistic research doesn't get much love is probably due to the preference for sound-bite research that affirms stereotypes. The Northern Cities Vowel Shift didn't exactly burn up the phone lines on the radio. Still, people have an interest in the sideshows of linguistics and science; researchers just have to do a better job of bringing people into the real show. A couple of letters to the editor or some such could say, "that was an interesting load of crap, but the true research is even more fascinating." I mean, the fact that most studies show men talk just as much as or more than women raises a number of interesting questions about gender roles and power. The stereotypes are not only wrong but pretty boring.