This paper discusses the problem of capturing human motion in a natural environment from the perspective of needs, significance, scenarios, and technical challenges. It also discusses the technologies that could be used to capture human motion and activity in such an environment: electromagnetic sensors, LED lights, inertial measurement units, range sensors, and computer vision-based markerless motion capture.
Two markerless motion capture methods for capturing human motion from video imagery are investigated and implemented. The first method describes the silhouette shape with a shape descriptor and maps that descriptor (the input vector) to joint angles (the output vector) through a mapping matrix, which is determined using a relevance vector machine. The second method performs pose estimation by fitting a 3D human model to the silhouette through iterative optimization: the distance between the silhouette and a template skeleton-surface model embedded inside it is minimized, which yields the joint angles and thus the pose. Silhouettes extracted from human animation data are used to train the methods. Initial results for the two methods are presented and analyzed.
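
The mapping step of the first method can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes that silhouette descriptors and joint angles are stored as rows of NumPy arrays, and it substitutes ridge-regularized least squares for the relevance vector machine purely to show how a mapping matrix relates the input vector to the output vector. The function names (`fit_mapping`, `predict_angles`), the dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_mapping(X, Y, ridge=1e-3):
    """Fit a mapping matrix W such that Y is approximately X @ W.

    X: (n_samples, n_features) silhouette shape descriptors (assumed layout).
    Y: (n_samples, n_angles) joint angles (assumed layout).
    Ridge-regularized least squares is a stand-in here, NOT a relevance
    vector machine.
    """
    n_features = X.shape[1]
    A = X.T @ X + ridge * np.eye(n_features)
    return np.linalg.solve(A, X.T @ Y)


def predict_angles(W, x):
    """Map one descriptor (or a batch of descriptors) to joint angles."""
    return x @ W


# Synthetic data for illustration only: 200 samples, 10-dimensional
# descriptors, 4 joint angles generated by a known linear map.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))
W_true = rng.normal(size=(10, 4))
Y_train = X_train @ W_true

W = fit_mapping(X_train, Y_train, ridge=1e-8)
angles = predict_angles(W, X_train[0])
```

A relevance vector machine would additionally select a sparse subset of training samples as basis functions and place a prior on the weights; the sketch keeps only the final linear mapping from descriptor to joint angles.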