
What You See Is What You Touch

Touchscreens are a genuine Big Deal, but it's hard to appreciate how big. As we'll see below, touchscreens break two core assumptions underlying how we've designed graphical user interfaces to date. I think we've only seen the start of it.

The mental model we use to design programs assumes a "pointer" that has a definite position but occupies no space. The graphic casually called a pointer is just a marker of where this abstract pointer is at any given time. It's a pointer to a pointer, as it were. The pointer moves along a continuous path on the X,Y plane in response to the user manipulating some device in real space. In other words, your computer is an Etch-A-Sketch.

Fitts's Law (and more generally, Steering Law) has strongly influenced UI design. It states, more or less, that bigger targets are easier to hit. The cool thing is that it tells you exactly how much easier, and what error rate you can expect for a given size and distance. It's one reason why high-value click targets like menu bars are placed at the edges of the screen: it gives them effectively infinite size, because the edge stops the pointer for you. It also explains why deep hierarchical menus suck: the longer the path you have to steer the pointer through, the more likely it is you'll make a mistake.
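The "exactly how much easier" part can be sketched in a few lines. This uses the common Shannon formulation of Fitts's Law; the constants a and b are device-dependent and must be fit from real measurements, so the values below are illustrative assumptions only.

```python
import math

def movement_time(distance_px, width_px, a=0.05, b=0.15):
    """Predicted seconds to acquire a target of a given width at a
    given distance. Shannon form: ID = log2(D/W + 1); a and b are
    empirical constants for the pointing device (assumed here)."""
    index_of_difficulty = math.log2(distance_px / width_px + 1)
    return a + b * index_of_difficulty

# Doubling the target width lowers the index of difficulty,
# so the wider target is predicted to be faster to hit.
print(movement_time(400, 20))
print(movement_time(400, 40))
```

This is also why edge targets behave as if they were infinitely wide: the pointer can overshoot freely, so the effective W is huge and the index of difficulty drops toward its floor.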

Computers went through a twenty-year period when displays became larger and denser while pointing technology became more accurate, but the basic model didn't change. We could present more information at once; toolbars grew and sprouted palettes and ribbons and submenus. The presentation of content on screen became more and more faithful to the final output, under the rubric "what you see is what you get". Programmers came to assume that all of these trends would continue, and produced a host of guidelines founded on the little dot that follows an unbroken path:

WYSIWYT

Then cell phones mutated into real pocket computers. So far so good; we can dig into history and new research on how to deal with small screens. It's going to be a pain trying to fit all the information people now expect into a smaller area, but it's possible. But these small screens are also touchable. The union of what you see and what you touch breaks the pointer model and (partially) Steering Law: the "pointer" is no longer a point, and its travel is no longer continuous. After fifty years of Etch-A-Sketch we get to play with fingerpaint. Not all of the implications are obvious yet, but here are six or seven:

Steering Law doesn't apply as strictly when the user can tap one point and then another without having to steer through a predetermined path in between. This is demonstrated by the new generation of iPad apps that use pop-out menus in ways that wouldn't work as well on a mouse-based system. The cost of a menu has gone down while the cost of screen real estate has gone up, producing a different solution.

Make touch areas more obvious, not less. The user has neither real tactile feedback nor a "hover" state to indicate that something will happen when he taps a particular place. Jakob Nielsen observes that iPad developers are currently partying like it's 1993, throwing all sorts of weird conventions into their apps while imitating print. Resist the temptation. A hallmark of new technology seems to be how it initially imitates or rejects its predecessors, then simulates them, and finally absorbs them. At the moment I'd say we're in between stage 1 and stage 2 with touchscreen interfaces. When you first get your hands on an e-reader with near-print resolution and "pages" you can flip with your fingers, it certainly feels like an apotheosis. Hey, you think, this is an acceptable simulation of a book. The illusion frays when you encounter a newspaper app that clings to some of the more annoying conventions of paper, does bizarre things when you tap a photo, and offers no search. [0]

The importance of large click targets goes way up on touch interfaces because we're using our big fat fingers instead of a geometric point. This brings up a fact we can no longer politely ignore: some fingers are fatter than others. Industrial designers know all about anthropometric variation. Anyone programming touch interfaces will have to learn about it too -- or at least some, um, rules of thumb. A controlling factor is the average adult male finger width, about 2cm. Female hands, not to mention the hands of children, average smaller. In practice, touch buttons seem to be usable for most people down to 8 or 9 mm, but not much smaller. [1]
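The catch is that touch targets are physical sizes, while layout code deals in pixels. A minimal sketch of the conversion, taking the 8 mm floor from above; the 326 dpi figure is just an assumed screen density, and real code would query the device for it:

```python
MIN_TARGET_MM = 8.0  # usable lower bound for most people, per the text

def mm_to_px(mm, dpi):
    """Convert a physical length in millimeters to pixels at a
    given screen density (25.4 mm per inch)."""
    return mm * dpi / 25.4

def is_touchable(width_px, height_px, dpi):
    """True if both dimensions meet the physical minimum."""
    min_px = mm_to_px(MIN_TARGET_MM, dpi)
    return width_px >= min_px and height_px >= min_px

# At an assumed 326 dpi, 8 mm is about 103 px, so a button that
# looked generous on a 96 dpi desktop monitor is far too small here.
print(mm_to_px(MIN_TARGET_MM, 326))   # ≈ 102.7
print(is_touchable(110, 110, 326))    # True
```

The same pixel dimensions pass or fail depending on the density of the screen they're drawn on, which is exactly why pixel-based size guidelines stopped working when displays got dense.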

People with smaller hands are generally happier with the software keyboards on mobiles, leading some to speculate on a conspiracy of pointy-fingered elves. A friend of mine with very large hands has to hold his phone in his left hand and poke delicately at the keys with thumb and forefinger. [2] It's possible that one interface will not be able to accommodate everyone, and why not? We have different-sized mice and chairs and playing cards. It would be interesting to see how application designers experiment with configurable button sizes as they do with font sizes. Some software keyboards make common letters like E and T larger than the rest.
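The frequency-scaled-keys idea can be sketched directly. The frequency table below is approximate English letter frequency (percent), and the linear scaling rule is an illustrative assumption, not any shipping keyboard's actual algorithm:

```python
# Approximate English letter frequencies, percent (partial table).
FREQ = {'e': 12.7, 't': 9.1, 'a': 8.2, 'o': 7.5, 'i': 7.0,
        'n': 6.7, 's': 6.3, 'h': 6.1, 'r': 6.0, 'z': 0.07}

def key_width(letter, base_px=60, boost=0.5):
    """Widen keys for frequent letters: up to `boost` extra width
    for the most common letter, scaled linearly by frequency.
    Letters missing from the table get a nominal 1.0 percent."""
    top = max(FREQ.values())
    f = FREQ.get(letter.lower(), 1.0)
    return round(base_px * (1 + boost * f / top))

print(key_width('e'))  # 90 -- the widest key
print(key_width('z'))  # 60 -- essentially the base width
```

By Fitts's Law this is a direct win: the letters you aim at most often become the easiest targets to hit.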

A related problem is how the finger and hand occlude other parts of the screen during interactions. Like those jerks in the front row of a movie theater, your hands get in the way at just the wrong time. Enlarging the thing being pressed is a good workaround, but what about the rest of the screen? There are quite a few first-person perspective games that overlay joystick controls on the main view. This kills peripheral vision: when turning left your hand covers most of the right side of the screen. Game developers know this, and that's why you often find several control schemes to choose from while they test out ideas. I suspect that those kinds of controls will gradually migrate to the bottom third of the screen.

More subtle changes are happening with eye focus. With a full-sized computer both eyes are focused on a single point about half a meter away. Pocket computers tend to be held closer, and it's uncomfortable to close-focus for long periods. Also, the occlusion problem is partially solved by defocusing your eyes, so parts of the screen blocked for one eye are visible to the other. It's almost automatic. If you know a fast phone typist you can test this out by watching their eyes as they type, then blocking the screen with your finger. Their eyes will turn slightly inward as they change focus.

That feels somehow wrong to me. Great touch-typing is why people love hardware keyboards. A while ago I prototyped a Morse code "keyboard" for my mobile phone to see how it compared to software QWERTY keyboards. It sounds funny, and it is partly a joke, but it's also the minimum thing that could possibly work. With practice you can get fairly good.

One thing the Morse experiment taught me was that Fitts's Law didn't go away completely. This is not exactly a new revelation, but there is always some hot zone that is easiest to hit, unique to a given device size and orientation. On a pocket computer in portrait mode the hot zone is at the bottom. In landscape mode it's the area on either side, about two-thirds of the way up, where the dit [.] and dah [-] buttons are. On a tablet the hot zone seems to be in the bottom right (or left) quadrant. Even so, the edges matter less for touch than they do for mouse-based systems, because a virtual edge cannot stop your real finger.

A new interaction model may also need to take into account handedness and fatigue. Within minutes of using the first version of my "keyboard" I found an annoying bias toward dits in the Morse alphabet. On a telegraph the dit is three times faster than the dah, so naturally Morse code uses dits more. The surprising part was how uncoordinated my left hand was and how quickly it got tired. I ended up putting three buttons on each side to balance out the work and to reduce the number of taps per character.
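The dit bias is easy to verify by counting symbols across the International Morse alphabet. Even before weighting by English letter frequency (which favors dit-heavy letters like E, I, and S further), dits outnumber dahs:

```python
# International Morse code, A-Z.
MORSE = {
    'A': '.-',   'B': '-...', 'C': '-.-.', 'D': '-..',  'E': '.',
    'F': '..-.', 'G': '--.',  'H': '....', 'I': '..',   'J': '.---',
    'K': '-.-',  'L': '.-..', 'M': '--',   'N': '-.',   'O': '---',
    'P': '.--.', 'Q': '--.-', 'R': '.-.',  'S': '...',  'T': '-',
    'U': '..-',  'V': '...-', 'W': '.--',  'X': '-..-', 'Y': '-.--',
    'Z': '--..',
}

dits = sum(code.count('.') for code in MORSE.values())
dahs = sum(code.count('-') for code in MORSE.values())
print(dits, dahs)  # 44 38
```

Put all the dits on one hand and that hand does most of the work, which is exactly the fatigue problem that forced the three-buttons-per-side redesign.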

And then you have multitouch, which knocks "mouse gestures" into a cocked hat, provided we can figure out how to use it effectively. Pinch/expand and rotate are very useful for controlling the "Z axis" perpendicular to the surface. There are also apps to simulate sound boards, pianos, and of course keyboards. Interestingly, multitouch doesn't break any design paradigms I can think of, but it replaces a lot of them, like resize and rotate "handles". Swipe can be (and is) overused, but it's a natural replacement for pagination and scrolling. Using a two-finger tap as a replacement for "right click" to bring up context menus seems to be another home run. There are probably a lot of natural places for it as a modifier signal. For example, a drawing program might allow you to paint with two fingers and control brush size by the distance between them. It's not an unmixed blessing: multitouch makes it harder to ignore accidental input from the palm and edges of the hand, which means the user can't treat a touch tablet as casually as paper just yet.
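The two-finger brush idea is a nice example of a multitouch modifier, and the geometry is trivial. A minimal sketch, assuming touch points arrive as (x, y) tuples; the clamping range is an illustrative choice:

```python
import math

def brush_from_touches(p1, p2, min_size=2.0, max_size=64.0):
    """Return (center, size) for a two-finger brush stroke: the brush
    follows the midpoint of the two touches, and its size tracks the
    distance between them, clamped to a sane range."""
    cx = (p1[0] + p2[0]) / 2
    cy = (p1[1] + p2[1]) / 2
    spread = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    size = max(min_size, min(max_size, spread))
    return (cx, cy), size

center, size = brush_from_touches((100, 100), (130, 140))
print(center, size)  # (115.0, 120.0) 50.0
```

Note that a single gesture here carries three continuous parameters at once (x, y, and size), something a one-point pointer simply cannot do without a mode switch.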

Touch interfaces remove one of the last physical barriers between users and digital data. Instead of manipulating data with a cartoon of a hand we control with silly instruments, we can poke it with our real fingers. This is deeply satisfying in a monkey kind of way. It gives programmers strange new problems and responsibilities. We will have to become amateur industrial designers, just as we became amateur typographers, linguists, and psychologists. It puts us much closer, physically and emotionally, to the person on the other side of the glass. As users touch our programs, our programs are touching back.

Notes

[0] Stage 3 is when the new technology stops trying too hard to simulate older technology, and instead directly addresses (or renders moot) the underlying need, using its unique advantages. Stage 3 is usually gradual. My computer still calls its background layer "the Desktop", years after real desks as such (with drawers, pictures, clocks, inboxes, etc.) distilled down to the humble table that holds my computer off the floor. Everything else has been sucked inside and transformed.

[1] This kind of stuff is fascinating, if you're the kind of person who's fascinated by this kind of stuff: "Thai females tended to have wider and thicker fingers but narrower knuckles than the females from Hong Kong, Britain, and India."
-- "Hand Anthropometry of Thai Female Industrial Workers", by N. Saengchaiya and Y. Bunterngchit, Journal of KMITNB, Vol. 14, No. 1, Jan 2004.

[2] Another friend of mine, who develops touchscreen apps for a living, tells me that finger size doesn't matter all that much. But I notice his fingers are rather pointy and elf-like...