Developing an iPad app with a rich gesture space and unique spatial-zooming visual model is technically challenging. Julia joins Mark and Adam to break down the software engineering behind Muse.
00:00:00 - Speaker 1: Also something that makes it very unique is this: you're basically floating through space and you're zooming deeper into your hierarchy, and all of this is like a perfect illusion of seamlessness when it's actually not seamless at all.
00:00:22 - Speaker 2: Hello and welcome to Metamuse. Muse is software for your iPad that helps you with ideation and problem solving. But this podcast isn't about Muse the product, it's about Muse the company and the small team behind it. My name's Adam Wiggins. I'm here today with my colleague Mark McGranaghan. Hey Adam. And my colleague, Julia Roggatz. Hi, Adam. And Julia, you have now made it two years in a row spending the entire winter in a sunny location away from your home in Germany. How's that working out for you? Are you going to repeat that again next year?
00:00:56 - Speaker 1: Oh yeah, I mean, I guess we’ll see about next year and what traveling is going to be like in the future.
Um, but at least for the past 2 years, I’ve really enjoyed that. I think, I mean, I love my hometown, Berlin, um.
And I love being here in the summer, but in the winter it can get quite gloomy and dark and cold, and I'm very much a sun person. So yeah, I've really been making good use of this remote company setup and, you know, the make-your-own-work-hours policy for the most part.
I've been spending lots of time in adventurous places, kind of splitting my workdays in half, which is something that I really like to do: get some work done in the morning, then do something nice outdoors, and then work some more hours in the night.
It's really been a nice balance for me throughout the winter time.
00:01:42 - Speaker 2: You have a very impressive ability to get stuff done while also interleaving it with adventure. You'll ship some major new feature and then go whale watching.
00:01:57 - Speaker 1: And then fix a bunch of bugs and then go kayaking, or be like, guys, I'm going to be 20 minutes late for the meeting, I've just got back from a scuba dive.
00:02:04 - Speaker 2: Yeah, absolutely. But it's also a reflection of the kind of work environment we built. Mark and I talked about this on a previous episode: trying to make a space that is flexible for all of the people on the team to live the kind of life they want to live. And for you, apparently, scuba diving and whale watching and kayaking is the life you want to live.
00:02:23 - Speaker 1: Yeah, it's definitely been amazing to not have to separate your life so much between work and traveling. Usually traveling for me always happened on vacation. And I actually find the mindset that I'm in when I travel, when I'm in a different country, to be extremely stimulating in many ways, and it actually makes me more productive. So being able to mix that has been quite a blessing.
00:02:47 - Speaker 2: So our topic today is iOS development, and then from you specifically, kind of our gesture system and why that's so challenging to implement.
But I thought maybe for context, for people that don't know how iOS development works, either because they know about software development generally but not necessarily mobile development, or even people who aren't that familiar with how software gets built: they might like to know, what does it look like for you? You sit down in the morning, or maybe the afternoon, to work on some features or fix some bugs, you're going to start crafting Muse out of artisanal ones and zeros. What does that actually physically look like? What devices are you using? What software are you using?
00:03:31 - Speaker 1: Yeah, so in terms of devices, I use a MacBook first and foremost. As far as I know, you still can't develop iOS or Mac software on any other platform. So that's where everything starts, and it comes with the IDE to develop for the iPad or iPhone, which is called Xcode.
00:03:51 - Speaker 2: IDE being integrated development environment.
00:03:54 - Speaker 1: Yes, correct. So basically it's kind of the entire toolkit that you need to write software for the iPad or the iPhone. You write all your code there, you compile it there, you debug it there.
So what I usually do is plug in the actual physical iPad.
Xcode also comes with a simulator, and you can run all of your iOS apps in the simulator itself. It basically just brings up a little screen on your computer that looks like an iPad or an iPhone, and you can do most things there.
But for an app like ours, which is extremely gesture driven and where we use the pencil for many things, it's a bit tedious to actually work with the simulator, and some things aren't possible at all. So I work with the physical device plugged in.
You can actually also build to it wirelessly as of a couple of years ago, but it is a little bit unstable, so I tend to just depend on the cable there. And yeah, then I just write some code, click one button, and then it runs on the device and I can test everything there.
00:04:59 - Speaker 2: And this is the Swift programming language, and for storing our data, the persistence layer is Core Data. Do we use any other fancy libraries or APIs, or is it mostly just that Apple gives you a pretty complete kit for development, everything from the editor through to the language, the simulator like you said? Whereas Mark and I actually both come from more of a web development background, and there you're putting together more mix and match: the tools, the language, and the different pieces. But here, in the Apple style, you get this one pretty complete kit.
00:05:33 - Speaker 1: Yeah, pretty much. So I think it's fairly rare for an iOS project to have zero dependencies on any sort of third-party libraries, but ours are actually quite minimal. I think we have something in there, for example, for zipping and unzipping files. That's something that, as far as I know, is not built into the iOS standard library.
But for the most part, the iOS SDK is extremely comprehensive. You can do all kinds of things with it. Over the years they've added much more stuff; especially open source third-party frameworks that were very successful have often been integrated in one way or another into the iOS ecosystem, or Apple has basically rolled their own version. So our dependence on external frameworks is actually quite small.
00:06:28 - Speaker 2: And at one point, maybe this was back when Muse was still a lab project, our persistence layer was Firebase, which is this kind of mobile backend data service from Google. I think we liked that pretty well, developer-experience-wise, but what led to us replacing that with the Apple-standard on-device storage?
00:06:49 - Speaker 1: Well, I think the main motivation here was that we basically didn't want to be dependent on Google and giving our users' data to be stored on Google servers. So I think that was the main motivation.
00:07:02 - Speaker 3: Yes, speaking of sending or not sending user data to Google, I'm really proud that we don't have any third-party analytics libraries integrated into Muse, because these are notorious for scraping all kinds of data and sending it to a bunch of third parties. You saw this recently with Zoom, for example, where they had, I think it was the Facebook SDK integrated, and apparently unbeknownst to them it was sending all kinds of user data to Facebook, presumably for advertising purposes. So I think that's a really healthy thing that we have with our current minimal dependencies.
00:07:30 - Speaker 2: We do have analytics, but this is a system built by you, or it's sort of a roll-our-own type thing.
00:07:37 - Speaker 3: Yeah, and it's extremely minimal and deliberate. Every single field that we send to this analytics service, which is basically 3 or 4, is handpicked by us. It's in our code, it's explicit, versus a dependency that's updating every week and scraping new random things from the OS and sending them to third-party servers where you have no control over it.
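(For readers curious what a hand-rolled, explicit analytics call can look like, here is a minimal sketch. The event name, fields, and endpoint are hypothetical, not Muse's actual schema.)

```swift
import Foundation

// Hypothetical sketch: every field is spelled out explicitly in code,
// so nothing gets scraped or sent implicitly by a third-party SDK.
struct AnalyticsEvent: Encodable {
    let name: String         // e.g. "board_created" (illustrative only)
    let appVersion: String
    let deviceModel: String
    let timestamp: Date
}

func send(_ event: AnalyticsEvent, to endpoint: URL) {
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try? JSONEncoder().encode(event)
    URLSession.shared.dataTask(with: request).resume()
}
```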
00:07:58 - Speaker 2: Mark, you end up building the backend side of things; Julia, you do the client side of things. How do you coordinate around that API? How do you figure out how to make those two ends meet?
00:08:08 - Speaker 1: I think for the most part, it's been pretty lightweight. We chat on Slack about what's needed for a certain thing. Often Mark ends up drafting a Notion document or something like that, an API docs or design specification kind of thing.
00:08:24 - Speaker 3: Yeah, exactly. So typically these Notion docs will first have the mental model, which I think is really important: what's the shape of the domain here, what are the key objects and key verbs. Then a sketch of the HTTP API, which again is usually very simple, and then a discussion of the behaviors that are behind that.
00:08:41 - Speaker 1: And then as soon as we get into implementing that, we usually end up being online around the same time and I'm telling him, OK, I've just implemented my side of this API, is it deployed yet? Can I start hitting it? And then, depending on what it is, I send some sort of event and Mark checks the logs to see if he's seeing the right thing. Often there are a few things from there that we need to fix, like something is not encoded in the right way, but we basically just tackle that together via Slack or a video call.
00:09:12 - Speaker 2: Maybe just to round out the tech stack discussion, since we referred to the front end there with Swift and Core Data: on the backend we're basically doing Ruby, Postgres, and Heroku, which for Mark and me is kind of our very standard toolkit.
You know, we came out of this research lab where our goal was to push the boundaries of technology and what we can do there, and try lots of weird and interesting cutting-edge things. But once you're moving into the realm of production and commercial products, they say: choose boring technology. Choose the boring things that are workhorses and have worked really well.
I've used Postgres, for example, for, I don't know, 15 or 20 years now, and there's always a shiny new thing, but the stuff that has performed reliably for you for a long time is often just the thing to use.
00:10:02 - Speaker 3: Yeah, I'm really happy with our backend stack, and of course Heroku, but also Postgres in particular. Such a great database, super rock solid, super flexible, and now we can use it for both our sort of online data as well as our analytics data.
00:10:16 - Speaker 2: Yeah, and a quick shout-out on that, kind of from the product perspective, to Dataclips, which is a little way to bundle up a SQL query in a form that you can share as a web page. We use that quite a bit as our kind of ad hoc analytics sharing system.
Right, well, let's get into the meaty part. Hopefully that gives some good context for technical or less technical folks about exactly what the pieces are here.
All of that, I think, is fairly standard stuff that you might see in an iOS app, or an iOS app that has a small backend. But Muse is trying to really push the boundaries of what you can do with a tablet app, with these unique gestures, treating the pencil differently from the hands, multi-handed gestures, and all this. So we have quite a bit of both design and engineering effort that has gone into our gesture system. But maybe we can start at the very beginning. Julia, what is a gesture?
00:11:20 - Speaker 1: A gesture is, well, it's a good question actually. I don't think I've ever defined that for someone. In terms of iOS development, there's actually a whole system around gestures, and gestures can be of one or more categories. So there is a pan gesture, which would be just setting your finger down on a screen and moving it somewhere. You might actually be touching an item that you want to drag along, but you can also pan for any other reason, for example to draw something. Then there are things like swipe gestures, which are also a pan in a way, but they're distinct, like just flipping through pages. Then there's scrolling, which is more of a continuous thing where you leave your finger down and scroll something; there's a scroll gesture. There's pinching, which is how you zoom in and out of things, and there's a whole bunch of other gestures that you can combine in your app to achieve different things. But they're usually triggered with your finger, or in our case, and in some other apps' cases, also with a pencil.
00:12:22 - Speaker 2: Yeah, probably from a user perspective, you don’t even think that much about something like a tap, a double tap, a swipe, a pinch.
These are all part of the magic and the beauty, I think, of multi-touch screens, and why they've sort of taken over the world in terms of interfaces: they do seem so natural, and it seems so obvious, the difference between, for example, a swipe, a scroll, and a pinch.
But in fact, there's quite a bit of logic needed to make sense of that stuff.
And I have experience with sort of mouse input, I wouldn't call them gestures, but basically interpreting what the user does on a desktop computer with a mouse, in my past life as a game developer. There, things are actually a lot simpler, because you generally have the X and Y position of the cursor and whether the buttons are down. And there is a time element for some things like double clicking, but it's pretty minor. Most things are really discrete.
The thing that I think really opened my eyes on this was, we both were at UIKonf last year, where you gave a talk. And another talk there was by Shannon Hughes, who works for the Omni Group. They make some great productivity tools like OmniGraffle and OmniFocus. She had worked on, I think, the iPad app for one of these, had gone pretty far on these gestures, and has even written an open source library for basically making a diagram of them. And she showed these kind of gesture disambiguation diagrams in real time, and you could see that there's this huge time component, where what makes a gesture a gesture is not a discrete moment in time. It's a collection of positions and touches in different places, and movements of those touches over time, and the accumulation of those things eventually resolves itself into the system deciding: OK, I just saw a pinch.
00:14:17 - Speaker 1: Yeah, exactly. And luckily we're getting pretty much all of that for free from the iOS SDK.
So you could, if you wanted to and you had the time, or if it's an interesting experiment for you, actually write all of that yourself. You can get just very raw touch input events from the system: on a view, you can basically just implement a couple of methods that will fire whenever a finger goes down or moves somewhere, with just a position and nothing else. And you could go from there and build your own logic: I think these fingers moved apart from each other, so it must be a pinch out. But luckily the folks at Apple have gone through all of that work for us and developed this concept of a gesture recognizer that you can just attach to any view. That will make the view respond to specific gestures, for example a pinch, and just notify you when that gesture first starts and then when it changes. For a pinch, for example, it'll also give you the scale: it starts out at a scale of 1, and then as you move your fingers further apart, the scale value changes, it notifies your callbacks, and then you can zoom or do whatever else you want to do with that pinch.
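(To make this concrete, here is a minimal sketch of the two approaches described above: handling raw touch callbacks yourself versus attaching a UIPinchGestureRecognizer and letting UIKit do the recognition. The view and method names are illustrative, not Muse's actual code.)

```swift
import UIKit

class CanvasView: UIView {
    // Option 1: raw touch callbacks. You only get positions, so any notion
    // of "pinch" or "swipe" has to be inferred by your own code.
    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        for touch in touches {
            print("finger down at", touch.location(in: self))
        }
    }

    // Option 2: attach a gesture recognizer and let UIKit do the work.
    func attachPinchRecognizer() {
        let pinch = UIPinchGestureRecognizer(target: self, action: #selector(handlePinch(_:)))
        addGestureRecognizer(pinch)
    }

    @objc private func handlePinch(_ recognizer: UIPinchGestureRecognizer) {
        switch recognizer.state {
        case .began:
            print("pinch started")
        case .changed:
            // Scale starts at 1.0 and grows as the fingers move apart.
            transform = CGAffineTransform(scaleX: recognizer.scale, y: recognizer.scale)
        case .ended, .cancelled:
            print("pinch finished")
        default:
            break
        }
    }
}
```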
00:15:33 - Speaker 2: Now, if I was to look at the raw data, and I think I've seen test programs that do this, the system that's reading these touches of course doesn't know which finger I'm setting down.
For me, where I can see my hand, it's pretty obvious that if I put down, for example, my thumb and my index finger near each other and move them out, that looks like a pinch gesture, or put them down further apart and move them in, that looks like a pinch. But the system can't tell any difference between doing that with my thumb and my pointer finger versus doing it with one thumb on each hand, which you could totally do. It just detects touches in certain locations, and then those touches start moving.
00:16:17 - Speaker 1: Yeah, exactly. And that's actually what makes everything we're trying to do so complicated.
In fact, a pinch is even recognized when the fingers move by only a very few pixels.
So one example that I can give from our app, where this was a bit of a puzzle that we had to solve, is that we want to allow two-finger scrolling on a board. That means you set down two fingers and move them, either to the left or right, to scroll the board. But we also have this sort of global pinch gesture recognizer that listens for you pinching out to zoom back to the parent board. And that gesture is triggered, or at least in the past has been triggered, by even the most minimal movement, because we wanted to build the app in a way that is super fluid, so that it responds to your touches right away. That means that even if you set two fingers down on the screen and they converge by maybe 5 pixels towards each other, the system will consider that a pinch and will immediately start the zooming transition. So when you're actually just using two fingers to try to scroll, there's basically no way to avoid that: you're not a robot, you're not going to be able to keep them completely parallel to each other. So we had to add a bit of custom disambiguation logic where a pinch is only triggered after the fingers have moved by, you know, maybe a scale of 1.1, so by more than 10 or 20 pixels depending on where you started with your fingers. That adds a little bit of delay to the system actually responding to your actions when you do want to pinch, which is a trade-off, obviously, but it's basically the only way you can make these two gestures work together and disambiguate them in some way.
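(Here is a minimal sketch of that kind of disambiguation, assuming a threshold on the pinch scale before the zoom transition is allowed to start. The 10% value and the class name are illustrative, not Muse's actual implementation.)

```swift
import UIKit

// Sketch: treat a two-finger gesture as a zoom only once the pinch scale has
// clearly left the "noise" band around 1.0; until then, scrolling can win.
final class ZoomCoordinator {
    private var zoomCommitted = false
    private let threshold: CGFloat = 0.10   // illustrative value

    func handlePinch(_ recognizer: UIPinchGestureRecognizer) {
        switch recognizer.state {
        case .changed:
            if !zoomCommitted && abs(recognizer.scale - 1.0) > threshold {
                zoomCommitted = true
                // Reset so the zoom starts from the current finger spread
                // rather than jumping by the threshold amount.
                recognizer.scale = 1.0
            }
            if zoomCommitted {
                // ... drive the zoom-out-to-parent-board transition here ...
            }
        case .ended, .cancelled, .failed:
            zoomCommitted = false
        default:
            break
        }
    }
}
```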
00:18:13 - Speaker 3: Yeah, this delay issue is really interesting.
One of our top-level design goals for Muse is that it's super fast and responsive.
So the idea is, as soon as you touch the screen and do something, the app should respond, so you always feel like you're directly manipulating your content. And as Julia was saying, that's really hard with these gestures that are potentially ambiguous. In some cases we've taken this approach where you just have a very small, basically imperceptible delay that allows you to disambiguate, and that seems to work pretty well. Another approach that I'm excited about trying is actually doing both optimistically, then retroactively picking one once the disambiguation becomes more clear, rolling forward with that one and unwinding the other. So you can imagine with this pinch versus scroll: you start doing a pinch or a scroll, it's ambiguous, and it basically starts zooming imperceptibly and scrolling imperceptibly. Then once it becomes clear that you've done one or the other, it unwinds the thing that it wasn't, you know, zooms back out slightly, for example, and keeps doing the thing that you were doing, scrolling, for example.
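(This optimistic approach is described here as an idea to try, not something confirmed to be in Muse. Below is a speculative sketch of how the commit decision could be modeled; all names and thresholds are hypothetical.)

```swift
import UIKit

// Speculative sketch of the "do both, then unwind one" idea.
enum TwoFingerIntent {
    case undecided
    case committedScroll   // caller animates the tentative zoom back
    case committedZoom     // caller animates the tentative scroll back
}

struct OptimisticResolver {
    private(set) var intent: TwoFingerIntent = .undecided

    // Feed in the accumulated pinch scale and pan translation; once one
    // signal clearly dominates, commit to it.
    mutating func update(scale: CGFloat, translation: CGPoint) -> TwoFingerIntent {
        guard intent == .undecided else { return intent }
        let zoomAmount = abs(scale - 1.0)                                // 0.1 means a 10% scale change
        let scrollAmount = hypot(translation.x, translation.y) / 100.0   // arbitrary normalization
        if zoomAmount > 0.1 && zoomAmount > scrollAmount {
            intent = .committedZoom
        } else if scrollAmount > 0.1 && scrollAmount > zoomAmount {
            intent = .committedScroll
        }
        return intent
    }

    mutating func reset() { intent = .undecided }
}
```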
00:19:15 - Speaker 1: Yeah, we're actually already doing some of that for a similar problem.
The same way that, as I was just talking about, you can two-finger scroll anywhere on a board, you can also drag any card on any board with one finger, and we deliberately, as you just pointed out, wanted to make that instant. Most apps work in a way where you hold your finger down on something, then it sort of enters a movable state, maybe slightly lifts, and then you can drag it around.
And that's exactly the thing that we didn't want, and I think one thing that makes Muse very unique is that it's ultra responsive.
So as soon as you set your finger down on a card and start moving it, and you can even have your finger already moving as you set it down, the card will start moving with you.
And so the problem with the two-finger scrolling here is that when you do want to use two fingers to scroll, we don't want to move that card. As you set down your two fingers, inevitably one of the two will land first, because again, we're humans, not robots. So even if it's just by a fraction of a second, that first finger that comes down and moves by one pixel will trigger the card movement.
But then the other finger comes down, and the system actually recognizes, oh, it's a scroll, and it cancels the card movement. So sometimes, if you do it very fast and your first finger goes down noticeably earlier than the other one, you will see your card start dragging and then jump, kind of animating back into the place where you picked it up from, and then the scrolling kicks in. So we're using that trick already a little bit in the app, but it's quite cumbersome to implement, so I hope eventually we'll have more of a unified approach for this kind of thing.
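(A minimal sketch of that behavior, assuming a one-finger pan on the card that tracks immediately and is force-cancelled when the two-finger scroll wins; the structure and names are illustrative, not Muse's actual code.)

```swift
import UIKit

// Start the card drag instantly, then roll it back if a two-finger scroll
// takes over.
final class CardDragCoordinator: NSObject, UIGestureRecognizerDelegate {
    weak var cardPan: UIPanGestureRecognizer?
    private var dragStartCenter: CGPoint = .zero

    @objc func handleCardPan(_ pan: UIPanGestureRecognizer) {
        guard let card = pan.view else { return }
        switch pan.state {
        case .began:
            dragStartCenter = card.center            // remember where we picked it up
        case .changed:
            let t = pan.translation(in: card.superview)
            card.center = CGPoint(x: dragStartCenter.x + t.x, y: dragStartCenter.y + t.y)
        case .cancelled:
            // The scroll won: animate the card back to where it was picked up.
            UIView.animate(withDuration: 0.2) { card.center = self.dragStartCenter }
        default:
            break
        }
    }

    @objc func handleBoardScroll(_ scroll: UIPanGestureRecognizer) {
        if scroll.state == .began, let cardPan = cardPan {
            // Disabling a recognizer mid-gesture forces it into .cancelled.
            cardPan.isEnabled = false
            cardPan.isEnabled = true
        }
        // ... drive the board scroll from scroll.translation(in:) ...
    }

    // Let the one-finger drag and the two-finger scroll track simultaneously
    // long enough for the disambiguation above to happen.
    func gestureRecognizer(_ gestureRecognizer: UIGestureRecognizer,
                           shouldRecognizeSimultaneouslyWith other: UIGestureRecognizer) -> Bool {
        return true
    }
}
```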
00:21:07 - Speaker 2: Can you talk a little bit about what the overall framework here is? Is it essentially a giant case statement or a series of if statements, or is it more of a state machine? What does that look like?
00:21:20 - Speaker 1: Yeah, it currently isn't really a very cohesive system, because of how some of the components interact. You still want to be able to give individual components the ability to control themselves, basically, without writing one global gesture handler.
00:21:41 - Speaker 2: By a component controlling itself, here you're talking about the fact that there's not one entry point for someone touching the screen. It's more that you want to attach a snippet of code or a piece of functionality to, say, a card, and it sort of knows, so to speak, how to manage the touches that it receives, and that can be somewhat independent from what another card does.
00:22:04 - Speaker 1: Yeah, exactly. For the cards, actually, we do have a bit more of a global approach, because of how much the card dragging interacts with other things, like zooming in and out of boards while you're dragging a card along.
00:22:16 - Speaker 2: So is this the maneuver?
00:22:19 - Speaker 1: Yeah, this is the maneuver that made everything so difficult for us.
00:22:23 - Speaker 2: OK. Well, the backstory here is that it's pretty critical, right? If you're inside a board and you have one or more cards you want to take elsewhere,
you can stick it in the inbox, or maybe you can use copy-paste, but that's kind of a hassle. Really, what you want to do is grab it and then navigate to your new location, and in fact, that's how it works. Sometimes we call it the two-handed card carry. You put your finger down, you've kind of picked up that card, so to speak, and then if I pinch out with the other hand, I can now freely navigate around and I keep this other card in a floating state. But that's the thing that doesn't work if the gesture handler is attached to the card itself.
00:23:04 - Speaker 1: Yeah, exactly, because in order to be able to carry the card into a different space, we basically have to detach it from its parent.
Before, it was living on this board, and if you had a gesture recognizer attached directly to the card, you could move it around the board, but as soon as you move it to a different parent, the gesture recognizer actually cancels and you basically lose that gesture. So in our case, in order to be able to carry it to a different board, we basically have to put it at the top level of the hierarchy, basically attach it to your window, so that you can zoom potentially many levels deeper or further up your hierarchy until you find the board where you want to put that card and let go.
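(Here is a minimal sketch of the "lift the card up to the window" idea, so the view being dragged survives the navigation happening underneath it. This is an illustrative sketch under those assumptions, not Muse's actual implementation.)

```swift
import UIKit

// Reparent the dragged card into the window so that board transitions below
// don't tear down the view (and with it, the in-flight drag).
func liftCardToWindow(_ card: UIView) {
    guard let board = card.superview, let window = card.window else { return }
    // Preserve the card's on-screen position while changing its parent.
    let frameInWindow = board.convert(card.frame, to: window)
    card.removeFromSuperview()
    card.frame = frameInWindow
    window.addSubview(card)   // now it floats above whatever board is shown
}

// When the user lets go over the destination board, hand the card back down.
func dropCard(_ card: UIView, onto destinationBoard: UIView) {
    guard let window = card.window else { return }
    let frameInBoard = window.convert(card.frame, to: destinationBoard)
    card.removeFromSuperview()
    card.frame = frameInBoard
    destinationBoard.addSubview(card)
}
```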
00:23:52 - Speaker 2: So many of these things are challenging because Muse does come from a different set of product design principles.
One of them is certainly the spatial zooming interface, but also that we want to maintain this illusion of a continuous fluid space.
I think with many other kinds of applications, you have this sense of going to different screens or different pages, and you know that when you navigate to that new screen or page, all of the stuff that was on the previous screen just goes away or isn't relevant in the new place.
And I think that's fine for a music player or something like that, but we've tried to create this big workspace where you can move stuff around freely within it. The libraries and the APIs that come with the iOS system, or I think any kind of UI system, are just not built assuming you want to do something like that.
00:24:47 - Speaker 3: Yeah, this is an aside, but I really like that there are no loading screens in Muse. You don't open documents or load them, they're just there when you look at them. That seems obvious, but when you go back and use an app where you're constantly loading documents, waiting for them to open, it's just a totally different experience. So I think it's worth the effort that we go through on the technical side.
00:25:07 - Speaker 1: Yeah, and not least of that is the sort of challenging model that we chose for Muse, which is also something that makes it very unique: you're basically floating through space and zooming deeper into your hierarchy, and all of this is a perfect illusion of seamlessness when it's actually not seamless at all. Basically, every new board that you visit has to be rendered by the system, it has to be loaded into memory. There are some tricks we're using there, but it's certainly not easy to keep up that illusion all the time.
00:25:43 - Speaker 2: Mark, I think you've made the comparison to video game development at various points, and this actually reminds me of that. You mentioned loading screens: video games with big continuous worlds, which are, I think, pretty common in today's open world games, have a similar technical challenge, in that you don't want to interrupt the player's movement and give them a "now loading" screen that really puts a kink in the experience or removes that illusion of being one continuous world.
But in fact, when you have this huge world that can't possibly fit in memory, you do need some way to handle that. I feel like a lot of the tricks that we've landed on to make this work for Muse would actually be quite at home in the video game world.
00:26:29 - Speaker 3: Absolutely. Circling back to gestures, then, perhaps we can talk about gesture spaces. This is the idea of the set of gestures that is possible in the app and the actions you can do with them: the set of things you could do with your hands or the pencil, and the actions in the app that maps to. We found that to be a really interesting challenge with Muse. One of the reasons this is so challenging is that it tends to be much more constrained than a desktop app. On desktop, you have the mouse, you have the two buttons, you have the mouse scroll wheel, you have the whole keyboard, and then you typically have the menus and the pattern of a right-click menu or a press-and-hold menu. Whereas on mobile, traditionally you just have basically one finger, and with Muse we're trying to extend it: you have 10 fingers and the pencil, but it's still quite limited. You don't have, for example, a menu where you can just add a bunch of stuff as you add more functionality to the app. So whenever you add a new feature, you need to find a way to invoke it with your hands, which isn't easy because there's quite a limited space to draw from. So that results in a few concrete challenges.
One is, you need to come up with particular gestures. An example for us is that we needed to find a way to pick the color of ink you're using, and for that we have the swipe-from-the-edge-of-the-screen gesture, which is I think pretty novel. And I guess that's something that's built into iOS?
00:27:55 - Speaker 1: Yeah, the swipe-from-the-edge-of-the-screen-with-your-pencil gesture is actually quite a harrowing thing that's still an ongoing problem for us.
So iOS actually does have a dedicated recognizer for this, I think it's called a UI edge swipe gesture recognizer or something like that. So there's a way you can attach a gesture recognizer to only those swipes that come from outside the device onto the screen, and I think iOS uses that for all of their system-wide things, like you can summon the dock from the bottom, or you can navigate back by swiping from the left.
When I was initially implementing this menu, I was like, oh great, we'll just use an edge swipe gesture recognizer like that and make it fire only for the pencil, because notably those system-wide gestures in iOS don't work with the pencil.
You can't summon the dock with the pencil, and you can't pull in Control Center with your pencil from the top. So I thought those gestures would just be up for grabs, but unfortunately that edge swipe gesture recognizer does not work with the pencil either.
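(For reference, the built-in UIKit class for this is UIScreenEdgePanGestureRecognizer. Below is a minimal sketch of attaching it, plus one possible workaround for pencil input using a plain pan recognizer restricted to pencil touches; the workaround is an assumption for illustration, not necessarily what Muse ships.)

```swift
import UIKit

func attachEdgeGestures(to view: UIView, target: Any, fingerAction: Selector, pencilAction: Selector) {
    // UIKit's built-in edge recognizer: fires only for swipes that start at
    // the chosen screen edge, but, as described above, not for Apple Pencil.
    let edgePan = UIScreenEdgePanGestureRecognizer(target: target, action: fingerAction)
    edgePan.edges = .right   // edge choice is illustrative
    view.addGestureRecognizer(edgePan)

    // Possible workaround (an assumption, not confirmed as Muse's approach):
    // a regular pan recognizer limited to pencil touches; the handler would
    // then check whether the pan began close enough to the screen edge.
    let pencilPan = UIPanGestureRecognizer(target: target, action: pencilAction)
    pencilPan.allowedTouchTypes = [NSNumber(value: UITouch.TouchType.pencil.rawValue)]
    view.addGestureRecognizer(pencilPan)
}
```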
00:29:05 - Speaker 3: Surprisingly often, we’ve run into the sort of edge of the map on the iOS APIs uh because we’re doing things that are quite unusual in Muse.
00:29:16 - Speaker 2: And what do you do for testing and debugging this stuff? You've talked about the simulator, but that's pretty poor for this kind of thing. Obviously you have the physical device there. How do you test this stuff?
00:29:28 - Speaker 1: Yeah, so testing gestures is obviously a little bit harder than other debugging in some cases, because you can't just put a breakpoint in the middle of a gesture to see exactly what's going on.
I mean, you can, but then when you focus your attention back to the screen to try to see what's going on, you have to lift your finger from the device you were just testing on, and once you continue execution, that gesture will have ended.
So what I usually do is put in a lot of logs. If I'm trying to disambiguate some gestures, and often it's very finicky, like which one fires first, which finger went down first, and was it on a card or on the board, then I just go manually through those logs and try to figure out the sequence in which things are happening and where I can intervene and tweak things.
Another thing that we use internally for debugging is a little system that visualizes your touches on the screen. That often helps to explain to other people on the team: look, when I'm doing this gesture, something happens that shouldn't be happening. There's a way we can activate basically little blue circles showing up where your fingers are, and a different colored circle for your pencil. Then it's really easy to record a video or do a screen share where you show your peers what exactly you're trying to do and where exactly your fingers are when certain things happen. So that's been a useful tool for the team, I would say.
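(A generic version of this kind of touch-visualizing overlay can be built by subclassing UIWindow and overriding sendEvent; here is a minimal sketch of that technique, not Muse's actual debug code.)

```swift
import UIKit

// Draws a circle under every active touch: blue for fingers, orange for the
// pencil, in the spirit of the debugging overlay described above.
final class TouchDebugWindow: UIWindow {
    private var markers: [UITouch: UIView] = [:]

    override func sendEvent(_ event: UIEvent) {
        if let touches = event.allTouches {
            for touch in touches {
                switch touch.phase {
                case .began:
                    let marker = UIView(frame: CGRect(x: 0, y: 0, width: 40, height: 40))
                    marker.layer.cornerRadius = 20
                    marker.isUserInteractionEnabled = false
                    marker.backgroundColor = (touch.type == .pencil)
                        ? UIColor.orange.withAlphaComponent(0.5)
                        : UIColor.blue.withAlphaComponent(0.5)
                    addSubview(marker)
                    markers[touch] = marker
                    fallthrough
                case .moved, .stationary:
                    markers[touch]?.center = touch.location(in: self)
                case .ended, .cancelled:
                    markers[touch]?.removeFromSuperview()
                    markers[touch] = nil
                default:
                    break
                }
            }
        }
        super.sendEvent(event)   // keep normal event delivery working
    }
}
```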
00:31:11 - Speaker 2: Yeah, by sharing a video, you're not only showing a reproducible case, but you can even slow things down.
I find it useful sometimes to slow down the video or pause it to figure out exactly what's happening where, and that makes it more reproducible.
I'm sure we can get more sophisticated with those tools over time, but yeah, those colored circles, combined with the screen recordings, have proved remarkably useful for us in testing.
You mentioned earlier the operating system gestures, like summoning the dock from the edge, and talking about the stylus from the edge reminded me of taking a screenshot by swiping the stylus in from one of the corners. I wonder what happens when we end up colliding with OS system gestures. For example, we had some capabilities in the app, back when it was more in kind of a beta prototype phase, that did involve dragging up from below, and those often interfered or had a bad interaction with the OS gesture to summon the dock. Notably, I think when we started working on the app, it was before the dock had been introduced, so that gesture to summon the dock didn't exist, but later it became totally foundational: now iPads and iPhones don't have home buttons to take you home, instead you swipe up from the bottom. But we basically had a swipe-from-the-edge-of-the-screen gesture, specifically from the bottom, and that was colliding with that in a pretty bad way, and we basically had to make a change there.
00:32:43 - Speaker 1: Yeah, well, I think the general rule here is that the operating system wins. This has been a trend over the past couple of years, where the OS has actually been taking over more and more gestures, particularly around the edges of the screen, and in many cases for apps that means they'll just have to change their gesture system.
There is a way that you can basically override these gestures and tell the system, I actually want to get this swipe first. This is something we tried out when we had the thing that you were able to pull in from the bottom. You can tell the system to defer its system gesture to let your own app get that gesture first, and what that does is make your app execute the gesture, but then also bring up this little arrow thing. If the user's intent was actually to pull up the dock, they have to basically do the gesture again on the arrow, and then pull up the dock. But that makes a lot of users very angry, and I think rightfully so. If you've learned how to use your device and you have muscle memory around certain things, certainly such fundamental things as switching apps and pulling up the dock, then you don't really want apps to interfere with that or override it with their own default behavior. So you basically just have to cave in.
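(The system-gesture deferral Julia describes is exposed to apps via a view controller override. A minimal sketch, with the class name and edge choice as illustrative assumptions:)

```swift
import UIKit

class BoardViewController: UIViewController {
    // Ask the system to give our app the first shot at swipes from the bottom
    // edge; the OS then shows the small arrow and requires a second swipe for
    // the dock, which is the behavior described above.
    override var preferredScreenEdgesDeferringSystemGestures: UIRectEdge {
        return .bottom
    }

    func edgePreferenceDidChange() {
        // Must be called whenever the returned edges change.
        setNeedsUpdateOfScreenEdgesDeferringSystemGestures()
    }
}
```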
00:34:11 - Speaker 3: Yeah, and there’s a risk here of major gesture space reflow.
I mentioned how we're using basically all the gesture space that we know of for the app. It's all packed with our different features and functionality, so if the OS takes away just one, you could have this musical chairs situation where one of the features of the app doesn't have anywhere to sit. Then you need to figure something totally different out for your gesture space, you know, open up a whole new room, for example. We've gone through that a few times, where we were just short of the degrees of freedom that we needed, so we needed to basically rethink how all of our gestures work.
00:34:47 - Speaker 2: Just recently, a friend of mine was learning the terminal, the Unix terminal, and in the process of doing this, this was on a Windows computer, I was surprised to learn that the copy command they're used to, pressing Control-C, does not work.
But it turns out Control-C has a long history, well predating the existence of copy-paste buffers, as the way to break out of a program in the Unix command line. So typically these terminals on Linux and Windows will basically take over that Control-C, because they need it for historical compatibility.
And in fact, users are quite used to that as a way to break out of a program.
But then if you're expecting that to be copy, which is an absolutely crucial capability that people rely on all the time, it's quite confusing, distracting, and annoying that it gets blocked and you need to use essentially another key command or another way of doing copy.
So that sort of thing has existed since time immemorial, but iOS, and the iPad in particular, is such a quickly evolving new space. We're trying to push the frontier, but the operating system maker is also trying to push the frontier simultaneously, so of course, as we explore the space, the likelihood of collisions is reasonably high.
00:36:05 - Speaker 1: And I think we’re already trying to do a lot of things differently, um, but we can’t possibly overload the user with too many weird things. So in some cases just doing the standard thing is probably also a good idea.
00:36:17 - Speaker 2: We're definitely pushing right up against the ceiling of the number of weird things for the user to learn.
00:36:23 - Speaker 3: So, Julia, looking forward, what are you excited to try in the gesture system?
00:36:28 - Speaker 1: So I'm actually still kind of flirting with this idea of something that you referenced earlier, that UIKonf talk by Shannon Hughes where she introduced this idea of actually building an entire state machine that manages all the gestures in your app.
That way you have one centralized place that always knows what's going on and what's possible going from one state to the next. So if you set down one finger on the screen, from there it might be possible to go into a pinch or into a card drag, and the state machine would handle all of the valid states and state transitions. That way you have a more deterministic and consistent approach to things, and you don't have to scatter different dependencies across different components of your app that all have to check: am I currently dragging a card? Do I need to cancel that drag in order to start the scroll?
So I think a bit more of a centralized approach there could actually be interesting, but it would also be a lot of work, so currently we haven't made that a focus, because what we have is working pretty well. But if we ever get bored, or if this ever becomes a huge issue, I think that would be something I'd be excited to try.
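(Purely as an illustration of that idea, here is a hypothetical sketch of such a centralized gesture state machine; none of these states, events, or thresholds are from Muse.)

```swift
import UIKit

// Hypothetical sketch: all gesture states and valid transitions in one place.
enum GestureState {
    case idle
    case touching(startPoint: CGPoint)
    case draggingCard
    case scrollingBoard
    case pinchingToZoom
}

enum GestureEvent {
    case fingerDown(at: CGPoint)
    case secondFingerDown
    case movedBeyondThreshold(scaleDelta: CGFloat, translation: CGPoint)
    case allFingersUp
}

struct GestureStateMachine {
    private(set) var state: GestureState = .idle

    // Components ask the machine what is going on instead of each keeping
    // their own "am I dragging a card?" flags.
    mutating func handle(_ event: GestureEvent) {
        switch (state, event) {
        case (.idle, .fingerDown(let point)):
            state = .touching(startPoint: point)
        case (.touching, .movedBeyondThreshold(let scaleDelta, _)) where abs(scaleDelta) > 0.1:
            state = .pinchingToZoom
        case (.touching, .movedBeyondThreshold(_, let translation)) where hypot(translation.x, translation.y) > 10:
            state = .draggingCard
        case (.touching, .secondFingerDown):
            state = .scrollingBoard
        case (_, .allFingersUp):
            state = .idle
        default:
            break   // invalid transitions are ignored
        }
    }
}
```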
00:37:50 - Speaker 2: While there's way more to talk about here, since we've invested a huge amount of time into this gesture system and certainly will going forward, perhaps we'll leave it there. So if any of our listeners out there have feedback, feel free to reach out to us at @MuseAppHQ on Twitter or hello@museapp.com by email. We'd love to hear your comments and ideas for future episodes. Julia, I'm very glad that you're working hard to make it possible for Muse users to have this fluid and powerful interface for interacting with their ideas.
00:38:25 - Speaker 1: Thanks. Yeah, it's been obviously a lot of fighting, but also a lot of fun.