Wednesday, February 22, 2012

Thinking Process

While working on the implementation of the game, various questions and challenges arise. The most important ones have to do with the jitter introduced by the Kinect sensor. I have set thresholds for most of the movement and rotation monitoring, but that diminishes the naturalness of the motion. More specifically, the problem arises when:
a) The remote player sidesteps, controlling the position of his window in his opponent's virtual world.
b) The remote player stretches his hands apart or pulls them closer, controlling the size of the window in his opponent's virtual world.
c) The local player leans left or right to look into the still window of her opponent.

After running lots of tests I have managed to strike a balance which I believe is acceptable, both in terms of minimizing jitter and in terms of achieving a relatively smooth motion (see video below).
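For the record, the kind of filtering I have been tuning boils down to a dead-zone threshold combined with some smoothing. A minimal sketch in Python, assuming a stream of 1-D joint coordinates; the threshold and smoothing constants here are placeholders, not the exact values I settled on:

```python
class JitterFilter:
    """Smooth a noisy 1-D Kinect joint coordinate.

    Combines a dead-zone threshold (tiny changes are ignored) with an
    exponential moving average (larger changes are blended in gradually).
    Both constants are hypothetical tuning values.
    """

    def __init__(self, threshold=0.02, alpha=0.3):
        self.threshold = threshold  # metres; changes below this are ignored
        self.alpha = alpha          # 0..1; lower = smoother but laggier
        self.value = None

    def update(self, raw):
        if self.value is None:
            self.value = raw
        elif abs(raw - self.value) > self.threshold:
            # blend toward the new reading instead of jumping straight to it
            self.value += self.alpha * (raw - self.value)
        return self.value
```

The trade-off in the text is visible in the two constants: a larger threshold kills more jitter but makes small intentional motions disappear, while a smaller alpha smooths harder at the cost of lag.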


Another issue that came up with "peeping" into your opponent's world is whether the rotation should be reset when the opponent is unfrozen. This is the default behaviour at the moment, but I am not sure it is the optimal one. The rotation currently freezes where you left off when your opponent resumes motion and resets to its initial value when you freeze her again. However, it could potentially be more interesting to have all the bonuses scattered around each player, which would mean that you would have to freeze your opponent repeatedly and move around almost 360 degrees to collect them all! And just to add to the confusion, maybe while he's frozen he could -in some way- move the bonuses around to make your life more difficult while you were trying to grab them! I guess I have to reach this point and try out these ideas before committing to one or another.
These mechanics, if they make it to the final version, bring up another idea worth considering: would it be appropriate to include a top-down view of your opponent's world with an indication of the part you're currently looking at? This would give you a better understanding of where you are looking and could also function as a guide to the bonuses' locations. The way I envision such a map is depicted in the following image. Leaning your body and peeping through the window (as in the video above) would cause the highlighted, torch-like beam to rotate accordingly.


I was also thinking about shooting and how you could control the direction of the ball (i.e., the bullet). The angle of your body would be the obvious way to do this, but on second thought such an option might deter the players from moving while trying to aim. Of course they would have to move in order to avoid enemy fire, but positioning yourself directly opposite your opponent to fire at them would still be an incentive to stand still. Again, this would have to be tested with actual players, but my intuition is that the more you involve the players in energetic motion, the more enjoyable the experience will be. The prospect of different firepower (maybe velocity and/or size) could be controlled through different body movements (e.g., a hand gesture shoots large but slow balls, while kicking shoots small and fast ones!).

Another concern has to do with the communication between the two players. My initial thought is network commands sent between the two machines, but I am not sure if this is the right decision. Another idea is having both players use the same machine with their windows projected on two displays (using a dual-output graphics card). Although this solution would relieve the application from having to handle network messages, it imposes the burden of having to drive two Kinects at the same time and propagate their sensor data to the application. Testing would be the decisive factor for this issue, too.
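To make the first option concrete, here is a minimal sketch of what the per-frame exchange could look like over UDP. The JSON message format and its field names are entirely my own invention for illustration, not an existing protocol:

```python
import json
import socket

# Hypothetical per-frame message: each machine sends its player's window
# state (position, size, frozen flag) to the opponent's machine over UDP.

def send_state(sock, addr, x, width, frozen):
    """Serialize the window state as JSON and fire it at the other machine."""
    msg = json.dumps({"x": x, "width": width, "frozen": frozen})
    sock.sendto(msg.encode("utf-8"), addr)

def recv_state(sock):
    """Block until the next state message arrives and decode it."""
    data, _ = sock.recvfrom(1024)
    return json.loads(data.decode("utf-8"))
```

UDP seems like the natural fit here since stale window positions are useless anyway: a dropped frame is better replaced by the next one than retransmitted.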

Wednesday, February 15, 2012

Visceral Intense Gaming of Opposing Realms (VIGOR*)

The suggested design comprises two individuals playing against each other on two different displays. Each player will inhabit a virtual world in which there is going to be a view into his opponent's world. This view will be controlled by the body gestures of the other player, who will be able to expand or collapse it in order to assist or hinder the opponent in accomplishing his goals. Both players will be involved in a quest-style game, in which they will have to collect items through the "keyhole" view of their opponent's world.
Innovation and novelty in our design stem from the fact that gameplay is tightly related to your opponent's body and actions, which also satisfies the four components of the Wheel of Joy that contribute to the "coolness" factor, as defined by Karen Holtzblatt [1]. Hence, our game leverages the sense of accomplishment commonly felt in competitive games; the social character of the game and the fact that you play through someone else's body will definitely foster intense connection; gamers will undoubtedly find that the whole experience provides them with the identity of an avid gamer; and the sensation deriving from the physical movements and actions with immediate impact on both environments will hopefully be high.

[1] Karen Holtzblatt. What makes things cool? Intentional design for innovation. interactions, pp. 40-47, November-December 2011.

* 1. Physical or mental strength, energy, or force. 2. Strong feeling; enthusiasm or intensity. [Definitions from the online "Free Dictionary" by Farlex.]

Testing and more brainstorming

I started working on my idea and experimenting with the Kinect and how I could make a "hole" out of a silhouette. I did not manage to do exactly that, but I don't even know if it would be meaningful to try and interact through such an irregular shape as someone's figure. Instead, I created a rectangular "window" which is adjustable depending on the user's posture. More specifically, I made the width relative to the distance between the user's hands, and the height relative to the distance between his left foot and his head. I experimented with different scale factors and managed to come up with an acceptable size. I then made the window move left and right according to the position of the player in space, also adding movement along the Z axis, although I don't know if that is going to be useful in my case. Finally, I made the view inside the window shift according to body movement, so that moving left-right does not merely slide the window over a fixed view of the second world. The result of these experimentations is illustrated in this brief video.
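The window-sizing logic above can be sketched in a few lines. Joint positions are (x, y) pairs from the Kinect skeleton, and the two scale constants are stand-ins for the values I arrived at by experimentation:

```python
# Hypothetical tuning constants (the actual factors were found by trial).
WIDTH_SCALE = 1.5
HEIGHT_SCALE = 1.2

def window_rect(left_hand, right_hand, head, left_foot, hip_center):
    """Derive the window rectangle from the player's posture.

    Width follows the distance between the hands, height the distance
    between head and left foot; the whole window tracks the player's
    position in space via the hip centre.  Returns (x, y, w, h).
    """
    width = abs(right_hand[0] - left_hand[0]) * WIDTH_SCALE
    height = abs(head[1] - left_foot[1]) * HEIGHT_SCALE
    center_x, center_y = hip_center[0], hip_center[1]
    return (center_x - width / 2, center_y - height / 2, width, height)
```

Calling this once per skeleton frame (after jitter filtering) gives the window its size and position for that frame.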

 

While developing this work, I was brainstorming on possible gaming scenarios. The most prominent ideas and questions that came out are presented below. Initially, I figured that trying to play a game through someone's moving body (or body-controlled window) would be almost impossible. That's why I assumed there should be an alternation between two different game modes. Initially, the players will try to shoot each other with some kind of non-lethal weapon (e.g., plastic balls). When a (window of a) player gets hit, the window freezes and the "shooter" can approach and have a closer, more thorough view through the window. If the view allows, he can aim at and pick up bonus items that he has to collect to finish the game. He could also shift his body left or right to change the viewing angle within the window; at this point his movement is not propagated to the other player's screen (i.e., the view controlled by his body is idle). After a predefined time, the normal game mode resumes and they start aiming at each other again.
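The mode alternation above essentially reduces to a freeze timer per window. A minimal sketch, with FREEZE_SECONDS standing in for the "predefined time" (the value is a placeholder):

```python
import time

FREEZE_SECONDS = 10.0  # hypothetical duration of the loot phase

class PlayerWindow:
    """Tracks whether a player's window is frozen after a hit."""

    def __init__(self):
        self.frozen_until = 0.0  # timestamp; 0 means never frozen

    def hit(self, now=None):
        """A successful shot freezes the window for FREEZE_SECONDS."""
        now = time.time() if now is None else now
        self.frozen_until = now + FREEZE_SECONDS

    def is_frozen(self, now=None):
        """While True, the shooter may peep and loot; the frozen player's
        body movements are not propagated to the opponent's screen."""
        now = time.time() if now is None else now
        return now < self.frozen_until
```

The game loop would simply branch on `is_frozen()` each frame: shoot mode when False, peep-and-loot mode when True.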

Some questions that arise from this scenario are the following:
  • How easy or difficult would it be to shoot while moving to avoid enemy fire?
  • Should there be any form of navigation in the environment, or just shifting motions?
  • What degree of engagement can such a repetitive process involve?
  • What other ideas/extras can make the game more interesting and novel?
Thinking about the last two questions made me come up with some additional gaming features. The first one has to do with the time a player remains "frozen" after being hit. Instead of a strictly predefined time period, that player could perform a specific repetitive -funny- gesture (e.g., move his right leg up and down 20 times) to shorten the freeze time! This gesture would be different every time and would provide an unpredictable factor for the player "looting" his environment. Another add-on could be providing different environments, like game levels, in which the players play. The complexity of those worlds (i.e., having obstacles or moving enemies), in addition to the adaptable distance between the players, could greatly improve playability. Also, shooting could be performed with a body gesture too, instead of a separate control device. If this gesture was, for example, stretching the hands wide open, it would have the added benefit of forcing players to enlarge their window in their opponent's environment.
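Counting a repetitive unfreezing gesture like "raise your right leg 20 times" is simple hysteresis on one joint coordinate: count a repetition each time the joint crosses above a high mark and back below a low mark. A sketch, with assumed mark values:

```python
class RepCounter:
    """Count repetitions of an up-down gesture on one joint coordinate.

    The low/high marks form a hysteresis band so that jitter around a
    single level can never register as a repetition.  All three
    constants are hypothetical tuning values.
    """

    def __init__(self, low=0.1, high=0.3, target=20):
        self.low, self.high, self.target = low, high, target
        self.count = 0
        self.up = False  # currently in the "leg raised" phase?

    def update(self, y):
        """Feed one joint sample; returns True once the player unfreezes."""
        if not self.up and y > self.high:
            self.up = True
        elif self.up and y < self.low:
            self.up = False
            self.count += 1
        return self.count >= self.target
```

Randomizing which joint and which target count to use each time would give the unpredictability mentioned above without changing this logic.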
Questions that came up after thinking about these ideas are:
  • Do the levels provide enough motivation to keep playing?
  • Does a separate gaming device (e.g., a Wii remote) provide an added benefit, or does it just diminish the naturalness of the interaction?
  • Is the variation offered by the repetitive "unfreezing" actions more of a contributor to player fatigue than a significant aid to gameplay?
Some other issues that I have been thinking about have to do with the setup of the physical environment, technical issues, and how all these affect gameplay. More specifically, I am concerned about the following:
  • Should the players be physically located in the same space?
  • If yes, should they use the same Kinect device and play on a PC splitting its output into two monitors? (hmm... probably not - collision danger!)
  • What happens if they get out of the camera's field of view? (could they "hide" and shoot from there or should there be physical obstacles preventing this?)
Finally, some final concerns have to do with the innovation and novelty part of the design:
  • Is the suggested game just another shoot 'em up with an insignificant new feature?
  • Does the shoot-freeze-loot sequence constitute a compromise to the true potential of the game?
  • Would the design make more sense for a collaborative game, or will that diminish its novelty?

Monday, February 6, 2012

Ideation goes on...

I wasn't very satisfied with the "incarnations" of physical games like Charades and Taboo using 3DI technology, so I started looking down a different path. I was still interested in games and consequently tried to come up with ways to enhance conventional gameplay. After some thought, I imagined it would be fun to be able to control your point of view (in first-person games) or your avatar's movement (in third-person view) using body gestures. There are so many games that could be used as testbeds, especially the ones where you control a virtual character and are involved in some kind of quest (e.g., Warcraft, Diablo, etc.). During my research I "stumbled" on FAAST (Flexible Action and Articulated Skeleton Toolkit), developed at the Institute for Creative Technologies at the University of Southern California. Their toolkit seems to do pretty well what I was thinking of (an example video is included below; more videos on their website), plus it provides an easy way to manipulate skeleton points in Vizard using the VRPN software. As they say, every cloud has a silver lining: this implementation relieves me from depending on Microsoft's drivers to use the Kinect, which would restrict the implementation to computers running Windows 7.


What about some learning?

I then decided to slightly change course and think of something with a more educational dimension, just in case this would prove to be a better source of inspiration. My thoughts were focused on how to make learning more fun and engaging by involving the whole body (besides the mind) and not just the hands holding a conventional interaction device like a mouse or even a PS/Wii remote. And the idea emerged! Two users would participate in the learning experience (I am keeping the collaborative aspect of my design): a teacher/parent and a student/child. They would both have to stand in front of a large display, although they could be either co-located or in remote spaces. The teaching/learning experience would involve the teacher throwing things at the student and the student trying to grab them!

Let's assume, for the sake of providing a practical example, that a mother is trying to teach her 5-year-old son the meaning of "skyscraper". She would have pictures or physical models of various artifacts, like buildings, cars, furniture, and toys, which she would move in front of a camera, and the camera would recognize each item and create a virtual object on the fly. Then, by making the gesture of throwing the physical artifact, the virtual object would be catapulted towards her son's avatar in the virtual world; her child would have to either move and avoid the item if it didn't describe the projected word (e.g., a car model), or try to reach the item if it was the right one (e.g., a miniature skyscraper). This could also involve the construction of compound words, so in our example the mother could throw -among other miscellaneous items- the picture of a sky and a scraper (see sketch below).


Of course there is a major challenge in this implementation, and it has to do with how to transform a physical object into a digital one. Maybe the simplest way would be to capture an image of the object and apply it as a texture to a generic shape, but the aesthetic result wouldn't be very appealing. Another idea is to have those items already rendered as models in the virtual world, and just use computer vision to match the object in front of the camera to its virtual model. However, this implementation is limited by the number of items that can be used in the educational activities. Finally, I could use the KinectFusion algorithm for local 3D mapping of the objects, but this would make the whole process much more complicated and less of a fun experience for the teacher/parent; additionally, it probably wouldn't work in the case of pictures.

Eureka?

One of the ideas that kept spinning in my head (mainly in my dreams... or should I say nightmares?) but that I could not put into context was to use the human silhouette as a portal to another virtual world! I eventually decided to give this idea some more thought while awake, since it was too recurrent to be ignored. Many great game ideas sparked from this initial one (one of them involving Pong!), but I will describe the prevailing one, which derived from distilling those former ones. Although I don't have a partner in this project, thankfully my wife participated willingly in my loud brainstorming sessions during last night's dinner for her birthday! And the innovative design idea which came up, and which I initially baptize visceral gaming, is the following...

There are two players playing in front of different screens; they could be remotely located or, for greater social interaction and eventually enjoyment, they could stand side by side. The world view of the first player will be augmented by a portal into a second world view, provided through the figure of the second player, and vice versa! The players would have to participate in a quest (e.g., find all the pieces to complete the image of an ancient temple) and could play either collaboratively or competitively. The challenge, and novelty, of the game stems from the fact that they can interact meaningfully only through the portal created by their partner's (or opponent's) body figure. This means that they will have to "plant" artifacts in the virtual world they inhabit, which will entice the other user to approach (i.e., the artifacts would be essential to his quest) in order to "enter" or get a view of the other person's virtual world, in which he can actually interact/play to complete his own quest. Sounds complicated, eh?


I believe that the above sketch, which illustrates this initial idea, might clarify things. A lot of refinement is needed to come up with a meaningful game scenario that would allow both players to be immersed in the experience and truly enjoy it. The core challenge in this design is how to implement the portal that would provide an adequate view through someone's silhouette, for the player to interact with this environment-through-keyhole. Maybe the figure should be reshaped into a regular shape (e.g., a circle with a diameter equal to the distance between the other player's head and toes) or adapted depending on the other player's body posture. Another concern has to do with the actual interaction with both virtual worlds; should body gesturing be used throughout, or should I use different devices for different world views (e.g., body gestures for the inhabited world and a Wii remote for the visceral world)?
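The circle-regularisation option is trivial to compute from two skeleton joints. A sketch, assuming head and toe positions as (x, y) pairs in the same 2-D screen space:

```python
import math

def portal_circle(head, toes):
    """Regularise the silhouette into a circular portal: centred between
    head and toes, with diameter equal to the head-to-toe distance.
    Returns (center, radius)."""
    diameter = math.dist(head, toes)
    center = ((head[0] + toes[0]) / 2, (head[1] + toes[1]) / 2)
    return center, diameter / 2
```

A posture-adaptive portal would instead recompute the shape each frame from more joints, which is exactly the trade-off raised above: a regular shape is easy to render and look through, but throws away the body-as-portal metaphor.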

No matter what, I truly believe that this idea has the potential to evolve into a challenging game experience and, eventually, into an impressive demo video with a high coolness* factor!

* The quest-style game definitely contributes to a sense of accomplishment, even more so if you manage to complete the game before your opponent! The social character of the game, especially if the players are co-located, and the fact that you play through someone else's body will definitely foster intense connection. Gamers will undoubtedly find that the whole experience provides them with identity: "I am an avid gamer!", and the sensation deriving from the physical movements and actions with immediate impact on both environments will hopefully be high.

Wednesday, February 1, 2012

Brainstorming for the next big idea...

After the physical pong proved to be far more popular than I expected, I started thinking about other 'old and beloved' games that could be aptly ported to the digital world of 3D interaction. Probably because I have invested too much time thinking about and refining the pong idea, and adding new elements to make it more engaging, I couldn't easily set my mind free to explore other similar implementations. Instead, I assumed it might be more fruitful to think in reverse, that is, come up with a conventional game that is played without technology and spice it up using 3D interaction.

Additionally, I would have liked it to be the kind of game that not everybody can play unless they have certain abilities. My mind immediately focused on people with disabilities, and I tried to think of a way to enable this group of people to perform actions and participate in collaborative gameplay that they wouldn't be able to without the use of technology. Since the Microsoft Kinect (which has drawn my attention at the moment) is eminently a device that allows rich ways of expression, I tried to think of a specific group that is at a disadvantage in this respect. The first that came to mind were people unable to speak, since they are missing one of the most important human attributes for expression: speech.

However, I had to come up with a game that demands expressive players; the one that immediately came to mind was Charades. Since this is a typical party game that demands, by default, body gestures, I initially thought it would be a good match for a game enhanced and facilitated by the Kinect. Nonetheless, in Charades the player acting as the descriptor for her team has to perform body gestures anyhow, meaning that the Kinect would have to be used for her teammates, who would be trying to find the described word/phrase. I wanted, though, a game that would involve Kinect-enhanced gameplay for all players, and that would also be equally enjoyable for non-mute people.

My second choice was Taboo, which is kind of the opposite of Charades. For quite a few reasons I considered it a far better candidate for a gesture-based alternative to a conventional game. Firstly, the players have to describe a word using just speech (no gestures allowed), which was the incentive for choosing a game that mute people cannot normally play. Secondly, it is more constrained as a game, since the player has to describe one word and has a set of words she must avoid. Lastly, all players could play using body gestures at the same time, even though this might be too challenging for the implementation.

One of the challenges of implementing Taboo using body gestures is what kind of gestures to make for the system to understand the corresponding word. Since there is no universal body language (besides the sign language that mute people use, but I wouldn't want to increase complexity that much), I thought of providing a way for the player to train the system, instead of hard-coding the moves. This way the game would be unique for each group of people, providing them with a means to customize it and associate gestures with words in the way that makes the most sense to them. Looking for similar implementations, I came across Kinetic Space (video below), which allows users to record up to 9 individual movements.

Since that was really close to what I had in mind, I downloaded and tried the tool; I even thought that I could use it to develop my design idea. However, its limitations became apparent from the beginning: nine moves are barely enough for a decent Taboo implementation, it uses only the hands and no other part of the body, and it seemed unable to match the moves accurately every time. Nonetheless, it is a ready-made tool which relieves me from having to implement the recording, and it can also record moves (1 second long) instead of still body gestures (although that is not my priority in the game).

The idea is pretty simple: the player describing the word performs the (trained) moves, which the Kinect detects, displaying the corresponding word on screen. The players who have to find the word throw out words by means of body gestures, too; when the right word is performed, the game gives one point to the team and proceeds to the next word. If the descriptor performs a gesture corresponding to a Taboo word, a sound is heard, the opposing team gets a point, and the next word to be played appears.
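The detection step above could be as simple as nearest-neighbour matching of a performed pose against the player's trained templates. Real movement matching (as in the tool above) compares sequences over time, but a still-pose sketch shows the idea; all names and the distance cutoff here are my own assumptions:

```python
import math

def pose_distance(a, b):
    """Euclidean distance between two poses given as flat lists of
    joint coordinates (same joint order assumed)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_gesture(pose, templates, max_distance=0.5):
    """Return the trained word whose template is closest to the performed
    pose, or None if nothing falls within max_distance.  `templates`
    maps word -> trained pose; the cutoff is a hypothetical value."""
    best_word, best_d = None, max_distance
    for word, template in templates.items():
        d = pose_distance(pose, template)
        if d < best_d:
            best_word, best_d = word, d
    return best_word
```

Per-player template dictionaries, loaded at login, would give exactly the personalised gesture sets discussed below, and `max_distance` is the knob between false matches and frustrating misses.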

Through some subsequent discussions, there was some debate that players' gestures could be directly recognized by the co-players (assuming they accurately represent a word or meaning), and eventually there would be no need for a Kinect or a screen. A counter-argument is that it is not possible to have a meaningful gesture for every word that can be played (e.g., you can mimic a monkey easily, but not a table, with a single gesture). Also, each player could train the system to match different gestures with the same word, and log in during gameplay in order to load her personal set of gestures. Having said all this, I understand that there is a huge cognitive load imposed on each player, who has to remember all the gestures she recorded during training and reproduce them accurately in the 'heat of the game'! This burden, along with the limited number of words that one can actually perform and remember, are the biggest hurdles of this Taboo implementation.

Consequently, I have to either address these problems with a clever workaround or come up with an alternative design idea.