WO2010082933A1 - Position estimation refinement - Google Patents

Position estimation refinement

Info

Publication number
WO2010082933A1
Authority
WO
WIPO (PCT)
Prior art keywords
markers
virtual
camera
photographic image
physical
Prior art date
Application number
PCT/US2009/031335
Other languages
French (fr)
Inventor
Chad M. Nelson
Phil Gorrow
David Montgomery
Original Assignee
World Golf Tour, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by World Golf Tour, Inc. filed Critical World Golf Tour, Inc.
Priority to PCT/US2009/031335 priority Critical patent/WO2010082933A1/en
Publication of WO2010082933A1 publication Critical patent/WO2010082933A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75Determining position or orientation of objects or cameras using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Definitions

  • Electronic games and other types of simulations recreate real world environments such as baseball diamonds, race tracks, and golf courses through three-dimensional (3D) computer generated graphics.
  • such graphics can typically create unnatural visual artifacts, such as repeating patterns, which detract from the intended realism of the imagery.
  • Some electronic games may use a photograph of an actual location as a background, such as mountains, with computer generated graphics rendered in the foreground.
  • such graphics rendered in the foreground can still detract from the intended realism of the imagery.
  • a 3D virtual environment, or a three-dimensional computer generated representation of a physical environment that can portray, for example, topologies, landmarks, and positions thereof, can be used in simulations.
  • Superimposing photographs on the 3D virtual environment can improve the intended realism of the simulation.
  • Positions of cameras used to capture the photographs can be used to perform the superimposing. Ideally, the positions of the cameras should be determined as accurately as possible.
  • photographic images (e.g., two dimensional (2D) photographs) can be superimposed on 3D computer generated graphics.
  • one aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving an inaccurate three-dimensional (3D) position of a physical camera positioned above a surface in a physical environment corresponding to an actual 3D position of the physical camera where the physical camera captured a photographic image of the physical environment from the actual 3D position and where the photographic image includes a plurality of markers on or above the surface, each marker having a known 3D position on or above the surface; basing an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera where the 3D virtual environment includes a 3D representation of the surface and a representation of the markers on or above the surface according to their known 3D positions, and where the virtual camera's field of view is of a portion of the representation of the surface; correlating one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view; and adjusting the initial 3D position of the virtual camera in the 3D virtual environment based on a disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view.
  • the photographic image is a two-dimensional (2D) photographic image.
  • the 3D position of the physical camera is defined by at least one of position data and attitude data.
  • the one or more markers in the photographic image are visual markers.
  • the one or more markers in the photographic image are virtual markers that are generated after the photographic image is captured.
  • the method further includes superimposing the photographic image over the portion of the 3D virtual environment in the field of view of the virtual camera.
  • the method further includes receiving input matching one or more of the one or more markers in the photographic image to one or more of the one or more markers in the virtual camera's field of view.
  • the correlating includes: minimizing the disparity based on the matching, producing a minimized disparity; and generating a reverse projection of an adjusted 3D position of the virtual camera based on the minimized disparity.
  • the method further includes generating a grid of the 3D virtual environment that includes the virtual camera configured according to the reverse projection.
  • the method further includes calculating the field of view of the virtual camera; and displaying a representation of the field of view on the grid.
  • Correlating markers in photographic images with markers in 3D virtual environments increases an accuracy of image alignment (e.g., proper position, orientation, and scale).
  • Correlating the markers allows a 3D position of a virtual camera to be calibrated, thereby increasing an accuracy of the 3D position of the virtual camera.
  • a realism of real world environment re-creations is increased and a user's experience is enhanced by providing the user a simulation that gives the user an impression that the 3D virtual environment was generated using 3D photography.
  • a quality of shot sequences based on portions of a 3D virtual environment in fields of view (FOV) of one or more virtual cameras in the 3D virtual environment, can be improved.
  • FIG. 1 illustrates an example course grid.
  • FIG. 2A is a photographic image of an example view of a physical environment.
  • FIG. 2B is an image of an example view of a 3D virtual environment.
  • FIG. 3 illustrates the photograph of FIG. 2A superimposed on the image of FIG. 2B.
  • FIG. 4 is a flow chart of an example process for aligning images.
  • FIG. 5A is a schematic of an example system for aligning images.
  • FIG. 5B is a schematic of an example alignment tool.
  • Various implementations recreate a 3D virtual environment (e.g., a computer simulated golf course) utilizing digital representations of actual photographs of a corresponding physical environment combined with computer generated 2D and 3D topographies in a virtual environment.
  • Portions of the virtual environment can be associated with actual photographs captured by one or more physical cameras that correspond to one or more virtual cameras positioned in the virtual environment. Adjusting positions of the one or more virtual cameras can be used to align digital representations of the actual photographs with computer generated 2D and 3D topographies.
  • FIG. 1 illustrates an example course grid 100.
  • the course grid 100 can be a grid that divides a hole of a golf course (e.g., a first hole, a second hole, an eighteenth hole), for example, into a plurality of cells (e.g., a cell 110).
  • Each cell can define a physical area of the course that can be photographed for use, for example, in the simulation of a virtual golf course.
  • Each cell can have one or more photographs associated with the cell.
  • the one or more photographs can include views of different angles of the portion of the golf course (e.g., views around 360 degrees from any point above or on the cell) corresponding to the area of the cell.
  • Each photograph associated with the cell can be associated with a virtual camera that corresponds to a physical camera that captured the photograph.
  • the cell 110 includes a virtual camera 120.
  • the virtual camera 120 can be disposed on the course grid 100 according to an associated 3D position.
  • the 3D position can include a location of the virtual camera 120 (e.g., x-, y-, and z- coordinates) and an orientation (e.g., attitude, pitch) of the virtual camera 120.
  • the 3D position can be determined from a measured 3D position of a corresponding physical camera in the real world environment.
  • the measured 3D position of the corresponding physical camera can be determined using global positioning system (GPS) data provided by GPS devices.
  • the physical cameras can include GPS devices that capture GPS data for the position and orientation of the physical camera used to capture each photograph.
  • a GPS device that is external to the physical camera can be used to capture the GPS data.
  • the measured 3D position of a corresponding physical camera can be inaccurate.
  • the GPS device may provide GPS data that indicates a measured 3D position of the physical camera that is different from the actual 3D position of the physical camera.
  • the 3D position determined from the GPS data can be used to dispose a corresponding virtual camera 120 at the 3D position in the cell 110.
  • the actual 3D location of the physical camera 130 would indicate that the corresponding virtual camera should have a different 3D position in the cell 110.
  • the virtual camera 120 should have the same 3D position as the actual 3D location of the physical camera 130.
  • Because the 3D position of the virtual camera 120 is based on an inaccurate measured 3D position of the physical camera 130, the virtual camera 120 also has a field of view 125 that is different from the field of view 135 of the physical camera 130. In other words, a first field of view associated with the virtual camera 120 and a second field of view associated with the physical camera 130 would portray different portions of a representation of the surface of the golf course. The first field of view associated with the virtual camera portrays what a camera would see at the measured 3D position, while the second field of view associated with the physical camera portrays what a camera would see at another 3D position (because the physical camera is not actually disposed at the inaccurate measured 3D position).
  • Accurately superimposing a photographic image onto a 3D virtual environment, and thereby determining an accurate location of the physical camera, can be facilitated by aligning markers in the photographic image with corresponding markers in the 3D virtual environment.
  • the physical environment can include markers (e.g., stationary objects such as sprinkler heads, stones, and trees) on or above the surface of the environment. Markers can also be introduced to the environment, on or above the environment. For example, colored cones or washers can be disposed at certain positions in the physical environment. In some implementations, the markers can be visible markers (e.g., the colored cones, or objects on or above the golf course such as sprinkler heads). In some implementations, the markers can be virtual markers. For example, a virtual marker can be generated by using radar to determine a 3D position, on or above the surface of the physical environment, that can be represented by a virtual marker, and metadata can be added to the photograph to indicate the virtual marker in a digital representation of the photograph.
  • a field of view can be defined as a portion of an observable environment that is seen at any given moment by a camera, or an angle extent thereof.
  • the photographic image represents the FOV of the physical camera 130, which is, in reality, different from the FOV of the virtual camera 120. Therefore, the photographic image may not be aligned properly with the virtual environment (e.g., accurately superimposed on the 3D topography).
  • the FOV 135 of the physical camera 130 includes a first object 150 (e.g., a marker represented by a circle) and a second object 160 (e.g., a marker represented by a triangle). Because of an inaccurate measured 3D position used to generate the virtual camera 120, the FOV 125 of the virtual camera 120 does not include the second object 160. Mapping a photographic image captured by the physical camera 130 using the inaccurate 3D position used to generate the virtual camera 120 would produce an inaccurate alignment (e.g., superimposition) of the photographic image on the virtual environment.
  • 3D positions can be determined for each marker.
  • measured 3D positions of the markers can be more accurate (e.g., a disparity of less than 3 inches from the actual location of the marker in the physical environment) than measured 3D positions of the physical cameras.
  • high-precision GPS devices can be used to determine the 3D positions of the markers.
  • FIG. 2A is a photographic image 200 of an example view of a physical environment.
  • the photographic image 200 shows the field of view 135 of the physical camera 130.
  • the photographic image 200 includes the markers 150 and 160 in the field of view 135.
  • FIG. 2B is an image 250 of an example view of a 3D virtual environment.
  • the view of the 3D virtual environment corresponds to the field of view 125 of the virtual camera 120.
  • the image 250 does not include the marker 160.
  • the photographic image 200 would show the same view as in the image 250.
  • the FOV 135 of the physical camera 130 is not identical to the FOV 125 of the virtual camera 120.
  • FIG. 3 illustrates the photograph of FIG. 2A superimposed on the image of FIG. 2B.
  • FIG. 3 illustrates a simple superimposition, where an alignment includes aligning the edges of the images (e.g., the edges of the FOVs).
  • the photographic image 200 is superimposed on the portion of the 3D virtual environment represented by the image 250, the terrain illustrated by photographic image 200 would not accurately align with the topography of the portion of the 3D virtual environment represented by the image 250.
  • a first depiction of the first marker 150 (as represented by the circle with a solid boundary) in the photographic image would not align properly with a second depiction of the first marker 150 (as represented by the circle with a dotted boundary).
  • depictions of the terrain (e.g., a hill) also would not align properly. Note that a second depiction of the second marker 160 is not visible. If the second marker 160 were an actual object (e.g., a bush) to be included in the golf course, there would not be a photographic image, or portion of a photographic image, to superimpose on a corresponding object in the 3D virtual environment.
  • Objects and/or markers can be aligned to increase the accuracy of image alignment.
  • the markers can be aligned by correlating markers that should correspond to each other. Markers that correspond to each other can be identified and correlated with each other.
  • an alignment tool can be used to facilitate selection of corresponding markers.
  • a user can use the alignment tool to select (e.g., click on) the first marker 150 in the photographic image 200 and to select the first marker 150 in the image 250.
  • the user can use a secondary action to correlate (e.g., match) the two selections.
  • the user can press one or more keys on a keyboard (e.g., CTRL + C).
  • a user can use the alignment tool to select the first marker 150 in the photographic image 200 and drag the selection onto the first marker 150 in the image 250, thereby matching the first marker 150 in the photographic image 200 with the first marker 150 in the image 250.
  • techniques for image recognition or pattern recognition can be used to identify (e.g., automatically) first markers and second markers that should be correlated.
  • a disparity between the corresponding markers in each pair of corresponding markers can be reduced (e.g., minimized).
  • a minimization algorithm can be used to reduce the disparity between the 3D positions of the corresponding markers in each pair of corresponding markers.
  • a multidimensional minimization algorithm such as Powell's method, can be used.
  • x_1 can represent a disparity (e.g., an error function) between a first pair of corresponding markers (e.g., 150 in FIG. 2A and 150 in FIG. 2B), and
  • x_2 can represent a disparity between a second pair of corresponding markers (e.g., 160 in FIG. 2A and 160 in FIG. 2B).
  • a minimization algorithm can be used to minimize the disparities.
  • the value of γ = γ_k that minimizes f(p_{k-1} + γ·U_k) can be determined for k = 1, 2, 3, ..., n.
  • the steps described in this paragraph (e.g., beginning with initializing p_0) can then be repeated until convergence is achieved, and a location of the minimum of the function z can be determined.
  • other minimization algorithms, such as Newton's method, quasi-Newton methods, and the Broyden-Fletcher-Goldfarb-Shanno method, can also be used.
  • additional alignment can be performed.
  • the alignment tool can allow a user to view the photographic image superimposed on the virtual environment, based on the minimization.
  • the user can further manually align the photographic image with the virtual environment by manually adjusting the placement (e.g., overlay) of the photographic image over the virtual environment.
  • the alignment tool allows a user to reduce the disparity between corresponding markers according to the user's visual inspection of the alignment (and not necessarily minimizing the disparity mathematically).
  • Other implementations are possible.
  • Reducing the disparity between corresponding markers in pairs of corresponding markers can be viewed as a process that is implicit in or that parallels a process of adjusting (e.g., moving) the 3D position of the virtual camera until a superimposition of the markers in the FOV of the physical camera on the markers in the FOV of the virtual camera provides an improved alignment.
  • the positions of markers in the virtual environment can be determined, and the positions of the markers in the virtual environment can then be used in a reverse projection (e.g., an overlay of the photographic image on the virtual environment) to determine a more accurate 3D position of the virtual camera.
  • the determined 3D position of the virtual camera can then be used to define a more accurate 3D position of the physical camera.
  • FIG. 4 is a flow chart of an example process 400 for aligning images.
  • the process 400 includes receiving 410 an inaccurate 3D position of a physical camera positioned above a surface in a physical environment corresponding to an actual 3D position of the physical camera where the physical camera captured a photographic image of the physical environment from the actual 3D position and where the photographic image includes a plurality of markers on or above the surface, each marker having a known 3D position on or above the surface.
  • an alignment tool can receive an inaccurate 3D position of the physical camera 130.
  • the process also includes basing 420 an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera where the 3D virtual environment includes a 3D representation of the surface and a representation of the markers on or above the surface according to their known 3D positions, and where the virtual camera's field of view is of a portion of the representation of the surface.
  • the alignment tool can base the initial 3D position of the virtual camera 120 on the inaccurate 3D position of the physical camera 130.
  • the process also includes correlating 430 one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view.
  • the alignment tool can correlate the first marker 150 in the photographic image with the corresponding first marker 150 in the virtual camera's field of view.
  • the process includes adjusting 440 the initial 3D position of the virtual camera in the 3D virtual environment based on the disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view.
  • the alignment tool can adjust the initial 3D position of the virtual camera 120.
  • the process can superimpose the photographic image over the portion of the 3D virtual environment in the field of view of the virtual camera.
  • FIG. 5A is a schematic of an example system for aligning images.
  • a data processing apparatus 510 can include hardware/firmware, an operating system and one or more programs or other software components, including an alignment tool 520.
  • the alignment tool 520 can use the data processing apparatus 510, to effect the operations described in this specification.
  • the alignment tool 520 can include photographic images 512, a 3D model 514 (e.g., a model of a 3D virtual environment), a selection engine 516, a minimization engine 518, and a projection engine 519.
  • the selection engine 516 can be used to select a marker in the photographic image 512 that corresponds to a marker in the 3D model 514.
  • the minimization engine 518 can minimize a disparity between the markers, and the projection engine 519 can determine a 3D position of the physical camera based on the minimized disparity.
  • the software components and engines identified above are described as being separate or distinct, one or more of the components and/or engines may be combined in a single process or routine.
  • the functional description provided herein including separation of responsibility for distinct functions is by way of example. Other groupings or other divisions of functional responsibilities can be made as necessary or in accordance with design preferences.
  • the alignment tool 520 in combination with one or more processors and computer-readable media (e.g., memory), represents one or more structural components in the system 500.
  • the alignment tool 520 can be an application, or a portion thereof.
  • an application is a computer program.
  • An application can be built entirely into the operating system (OS) of the data processing apparatus 510, or an application can have one or more different components located in different locations (e.g., one portion in the OS or kernel mode, one portion in the user mode, and one portion in a remote server), and an application can be built on a runtime library serving as a software platform of the apparatus 510.
  • OS operating system
  • an application can be built on a runtime library serving as a software platform of the apparatus 510.
  • the data processing apparatus 510 includes one or more processors 530 and at least one computer-readable medium 540 (e.g., random access memory, or other storage device).
  • the data processing apparatus 510 can also include a communication interface 550, one or more user interface devices 560, and one or more additional devices 570.
  • the user interface devices 560 can include display screens, keyboards, mouse, stylus, or any combination thereof.
  • the data processing apparatus 510 is operable to perform the operations of process 400, for example.
  • the data processing apparatus 510 can transmit data (e.g., 3D positions, images) or request data over the network 580.
  • the described systems and techniques can be used to determine more accurate positions of stereo cameras in a physical environment, and align stereo pair photographs of the physical environment captured by the stereo cameras with 3D virtual environments.
  • photographic images associated with virtual cameras can be used to simulate movement through a computer generated virtual environment (e.g., a 3D virtual model).
  • the positions of the virtual cameras and the portions of the virtual environment can be used to select shots (e.g., photographic images portraying an associated virtual camera's field of view) to show a first-person view of a golf ball moving through a golf course, for example.
  • a realism of movement through the virtual environment increases.
  • two frames of an animation can be generated from a first photographic image portraying a first FOV and a second photographic image portraying a second FOV.
  • the amount of change between the position of the first virtual camera and the position of the second virtual camera can, ideally, be minimized.
  • the positions of the virtual cameras and the portions of the virtual environment can be used to select shots to show different angles of the golf ball (e.g., points of view) as it moves through the golf course.
  • a trajectory of the golf ball can be divided into a plurality of portions.
  • the camera angle can show the golf ball's movement through the golf course from a view behind the golf ball (e.g., from the perspective of a player standing at the tee).
  • the camera angle can show the golf ball's movement through the golf course from a side view of the golf ball (e.g., from the perspective of a spectator standing on the sidelines in the middle of the course).
  • the camera angle can show the golf ball's movement through the golf course from a view in front of the golf ball (e.g., from the perspective of the hole).
  • the 3D positions of the virtual cameras and portions of the virtual environment in the FOVs of the virtual cameras can be used to select shots, e.g., to show different angles of the virtual environment according to a given position.
  • multiple 3D positions (e.g., different orientations) of a single virtual camera can be used to rotate a view of the virtual environment to see 360 degrees around a single position on the golf course.
  • multiple virtual cameras disposed at different 3D positions around a golf ball, for example, can be used to rotate a view to see angles ranging from 0 to 360 degrees around a stationary golf ball.
  • Other implementations are possible.
  • the positions of the virtual cameras can be plotted on a grid that corresponds to the virtual environment.
  • the plotted grid can be used to determine statistics associated with the virtual cameras and the virtual environment, such as, but not limited to virtual camera coverage and density.
  • a FOV of a virtual camera can be calculated based on a 3D position of the virtual camera and the virtual camera's sensor size and focal length (see the sketch following this list).
  • a representation of a FOV of each virtual camera (e.g., 125 and 135 of FIG. 1) can also be superimposed, with each corresponding virtual camera, on the grid to show virtual camera coverage.
  • virtual camera coverage can be defined by an amount, or area, of the grid that has been captured in a photographic image.
  • Virtual camera density can be defined, for example, as a number of virtual cameras plotted within a predetermined area of the grid (e.g., a predetermined number of cells).
  • Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
  • Embodiments of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible program carrier for execution by, or to control the operation of, data processing apparatus.
  • the tangible program carrier can be a computer-readable medium.
  • the computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, or a combination of one or more of them.
  • data processing apparatus encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
  • the apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
  • a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program does not necessarily correspond to a file in a file system.
  • a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
  • a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
  • processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
  • a processor will receive instructions and data from a read-only memory or a random access memory or both.
  • the essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks.
  • a computer need not have such devices.
  • a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, to name just a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
  • the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
  • embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT or LCD monitor, for displaying information to the user, and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer.
  • Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
  • Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components.
  • the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN").
  • the computing system can include clients and servers.
  • a client and server are generally remote from each other and typically interact through a communication network.
  • the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
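
Referring back to the note above on calculating a virtual camera's field of view (and on camera coverage and density): the following is a minimal sketch of those calculations under a simple pinhole-camera assumption. The function names and example values are illustrative and are not taken from the patent.

```python
import math

def field_of_view_deg(sensor_size_mm, focal_length_mm):
    """Angular field of view of a simple pinhole camera, from sensor size and focal length."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_length_mm)))

def camera_density(camera_cells, area_cells):
    """Number of virtual cameras plotted within a predetermined set of grid cells."""
    return sum(1 for cell in camera_cells if cell in area_cells)

# e.g., a 36 mm sensor behind a 50 mm lens gives roughly a 39.6 degree field of view
print(round(field_of_view_deg(36.0, 50.0), 1))
```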

Abstract

Methods, systems, and apparatus, including computer program products, for aligning images are disclosed. In one aspect, a method includes receiving an inaccurate three dimensional (3D) position of a physical camera, where the physical camera captured a photographic image; basing an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera; correlating one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view; and adjusting the initial 3D position of the virtual camera in the 3D virtual environment based on a disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view.

Description

POSITION ESTIMATION REFINEMENT
BACKGROUND
Electronic games and other types of simulations recreate real world environments such as baseball diamonds, race tracks, and golf courses through three-dimensional (3D) computer generated graphics. However, such graphics can typically create unnatural visual artifacts such as repeating patterns which detract from the intended realism of the imagery. Some electronic games may use a photograph of an actual location as a background, such as mountains, with computer generated graphics rendered in the foreground. However, such graphics rendered in the foreground can still detract from the intended realism of the imagery. A 3D virtual environment, or a three-dimensional computer generated representation of a physical environment that can portray, for example, topologies, landmarks, and positions thereof, can be used in simulations. Superimposing photographs on the 3D virtual environment can improve the intended realism of the simulation. Positions of cameras used to capture the photographs can be used to perform the superimposing. Ideally, the positions of the cameras should be determined as accurately as possible.
SUMMARY
This specification relates to aligning images. In some implementations, photographic images (e.g., two dimensional (2D) photographs) can be superimposed on 3D computer generated graphics. In general, one aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving an inaccurate three-dimensional (3D) position of a physical camera positioned above a surface in a physical environment corresponding to an actual 3D position of the physical camera where the physical camera captured a photographic image of the physical environment from the actual 3D position and where the photographic image includes a plurality of markers on or above the surface, each marker having a known 3D position on or above the surface; basing an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera where the 3D virtual environment includes a 3D representation of the surface and a representation of the markers on or above the surface according to their known 3D positions, and where the virtual camera's field of view is of a portion of the representation of the surface; correlating one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view; and adjusting the initial 3D position of the virtual camera in the 3D virtual environment based on a disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view. Other embodiments of this aspect include corresponding systems, apparatus, and computer program products.
These and other embodiments can optionally include one or more of the following features. The photographic image is a two-dimensional (2D) photographic image. The 3D position of the physical camera is defined by at least one of position data and attitude data. The one or more markers in the photographic image are visual markers. The one or more markers in the photographic image are virtual markers that are generated after the photographic image is captured.
The method further includes superimposing the photographic image over the portion of the 3D virtual environment in the field of view of the virtual camera. The method further includes receiving input matching one or more of the one or more markers in the photographic image to one or more of the one or more markers in the virtual camera's field of view. The correlating includes: minimizing the disparity based on the matching, producing a minimized disparity; and generating a reverse projection of an adjusted 3D position of the virtual camera based on the minimized disparity. The method further includes generating a grid of the 3D virtual environment that includes the virtual camera configured according to the reverse projection. The method further includes calculating the field of view of the virtual camera; and displaying a representation of the field of view on the grid.
Particular embodiments of the subject matter described in this specification can be implemented to realize one or more of the following advantages. Correlating markers in photographic images with markers in 3D virtual environments increases an accuracy of image alignment (e.g., proper position, orientation, and scale). Correlating the markers allows a 3D position of a virtual camera to be calibrated, thereby increasing an accuracy of the 3D position of the virtual camera. As a result, a realism of real world environment re-creations is increased and a user's experience is enhanced by providing the user a simulation that gives the user an impression that the 3D virtual environment was generated using 3D photography. As another result, a quality of shot sequences, based on portions of a 3D virtual environment in fields of view (FOV) of one or more virtual cameras in the 3D virtual environment, can be improved.
The details of one or more embodiments of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 illustrates an example course grid. FIG. 2A is a photographic image of an example view of a physical environment.
FIG. 2B is an image of an example view of a 3D virtual environment. FIG. 3 illustrates the photograph of FIG. 2A superimposed on the image of FIG. 2B. FIG. 4 is a flow chart of an example process for aligning images. FIG. 5A is a schematic of an example system for aligning images. FIG. 5B is a schematic of an example alignment tool.
Like reference numbers and designations in the various drawings indicate like elements.
DETAILED DESCRIPTION
Various implementations recreate a 3D virtual environment (e.g., a computer simulated golf course) utilizing digital representations of actual photographs of a corresponding physical environment combined with computer generated 2D and 3D topographies in a virtual environment. Portions of the virtual environment can be associated with actual photographs captured by one or more physical cameras that correspond to one or more virtual cameras positioned in the virtual environment. Adjusting positions of the one or more virtual cameras can be used to align digital representations of the actual photographs with computer generated 2D and 3D topographies.
FIG. 1 illustrates an example course grid 100. In some implementations, the course grid 100 can be a grid that divides a hole of a golf course (e.g., a first hole, a second hole, an eighteenth hole), for example, into a plurality of cells (e.g., a cell 110). Each cell can define a physical area of the course that can be photographed for use, for example, in the simulation of a virtual golf course. Each cell can have one or more photographs associated with the cell. For example, the one or more photographs can include views of different angles of the portion of the golf course (e.g., views around 360 degrees from any point above or on the cell) corresponding to the area of the cell. Each photograph associated with the cell can be associated with a virtual camera that corresponds to a physical camera that captured the photograph. For example, the cell 110 includes a virtual camera 120. The virtual camera 120 can be disposed on the course grid 100 according to an associated 3D position. The 3D position can include a location of the virtual camera 120 (e.g., x-, y-, and z-coordinates) and an orientation (e.g., attitude, pitch) of the virtual camera 120. Initially, the 3D position can be determined from a measured 3D position of a corresponding physical camera in the real world environment.
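One way to picture the relationship between grid cells, virtual cameras, and their 3D positions is the small sketch below. It is only an illustration; the field names, coordinate values, and file name are assumptions, not structures defined by the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class VirtualCamera:
    position: Tuple[float, float, float]      # x, y, z location on or above the course
    orientation: Tuple[float, float, float]   # e.g., yaw, pitch, roll (attitude)
    photo_path: str                           # photograph captured by the corresponding physical camera

@dataclass
class GridCell:
    row: int
    col: int
    cameras: List[VirtualCamera] = field(default_factory=list)   # one virtual camera per photograph

# e.g., cell 110 holding virtual camera 120, initially placed at the GPS-measured pose
cell_110 = GridCell(row=3, col=7, cameras=[
    VirtualCamera(position=(12.0, 48.0, 1.6), orientation=(0.30, -0.05, 0.0),
                  photo_path="hole01_cell110_view01.jpg"),
])
```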
The measured 3D position of the corresponding physical camera can be determined using global positioning system (GPS) data provided by GPS devices. For example, the physical cameras can include GPS devices that capture GPS data for the position and orientation of the physical camera used to capture each photograph. As another example, a GPS device that is external to the physical camera can be used to capture the GPS data.
The measured 3D position of a corresponding physical camera can be inaccurate. For example, the GPS device may provide GPS data that indicates a measured 3D position of the physical camera that is different from the actual 3D position of the physical camera. Returning to FIG. 1, for example, the 3D position determined from the GPS data can be used to dispose a corresponding virtual camera 120 at the 3D position in the cell 110. However, the actual 3D location of the physical camera 130 would indicate that the corresponding virtual camera should have a different 3D position in the cell 110. In practice, the virtual camera 120 should have the same 3D position as the actual 3D location of the physical camera 130.
Because the 3D position of the virtual camera 120 is based on an inaccurate measured 3D position of the physical camera 130, the virtual camera 120 also has a field of view 125 that is different from the field of view 135 of the physical camera 130. In other words, a first field of view associated with the virtual camera 120 and a second field of view associated with the physical camera 130 would portray different portions of a representation of the surface of the golf course. The first field of view associated with the virtual camera portrays what a camera would see at the measured 3D position, while the second field of view associated with the physical camera portrays what a camera would see at another 3D position (because the physical camera is not actually disposed at the inaccurate measured 3D position).
Accurately superimposing a photographic image onto a 3D virtual environment, and thereby determining an accurate location of the physical camera, can be facilitated by aligning markers in the photographic image with corresponding markers in the 3D virtual environment.
The physical environment can include markers (e.g., stationary objects such as sprinkler heads, stones, and trees) on or above the surface of the environment. Markers can also be introduced to the environment, on or above the environment. For example, colored cones or washers can be disposed at certain positions in the physical environment. In some implementations, the markers can be visible markers (e.g., the colored cones, or objects on or above the golf course such as sprinkler heads). In some implementations, the markers can be virtual markers. For example, a virtual marker can be generated by using radar to determine a 3D position, on or above the surface of the physical environment, that can be represented by a virtual marker, and metadata can be added to the photograph to indicate the virtual marker in a digital representation of the photograph.
Because the virtual camera 120 and the physical camera 130 portray different portions of the virtual environment, alignment of a photographic image captured by the physical camera 130 using a corresponding position (and FOV) of the virtual camera 120 can be inaccurate. A field of view can be defined as a portion of an observable environment that is seen at any given moment by a camera, or an angle extent thereof. The photographic image represents the FOV of the physical camera 130, which is, in reality, different from the FOV of the virtual camera 120. Therefore, the photographic image may not be aligned properly with the virtual environment (e.g., accurately superimposed on the 3D topography).
As shown in FIG. 1, the FOV 135 of the physical camera 130 includes a first object 150 (e.g., a marker represented by a circle) and a second object 160 (e.g., a marker represented by a triangle). Because of an inaccurate measured 3D position used to generate the virtual camera 120, the FOV 125 of the virtual camera 120 does not include the second object 160. Mapping a photographic image captured by the physical camera 130 using the inaccurate 3D position used to generate the virtual camera 120 would produce an inaccurate alignment (e.g., superimposition) of the photographic image on the virtual environment.
3D positions can be determined for each marker. In some implementations, measured 3D positions of the markers can be more accurate (e.g., a disparity of less than 3 inches from the actual location of the marker in the physical environment) than measured 3D positions of the physical cameras. For example, high-precision GPS devices can be used to determine the 3D positions of the markers.
FIG. 2A is a photographic image 200 of an example view of a physical environment. The photographic image 200 shows the field of view 135 of the physical camera 130. The photographic image 200 includes the markers 150 and 160 in the field of view 135. FIG. 2B is an image 250 of an example view of a 3D virtual environment. The view of the 3D virtual environment corresponds to the field of view 125 of the virtual camera 120. As shown in FIG. 2B, the image 250 does not include the marker 160. Ideally, the photographic image 200 would show the same view as in the image 250. However, because the measured 3D position of the physical camera 130 may be inaccurate (e.g., a disparity of 3 inches or more from the actual location of the physical camera in the physical environment), the FOV 135 of the physical camera 130 is not identical to the FOV 125 of the virtual camera 120.
FIG. 3 illustrates the photograph of FIG. 2A superimposed on the image of FIG. 2B. In particular, FIG. 3 illustrates a simple superimposition, where an alignment includes aligning the edges of the images (e.g., the edges of the FOVs). If the photographic image 200 is superimposed on the portion of the 3D virtual environment represented by the image 250, the terrain illustrated by photographic image 200 would not accurately align with the topography of the portion of the 3D virtual environment represented by the image 250. In addition, a first depiction of the first marker 150 (as represented by the circle with a solid boundary) in the photographic image would not align properly with a second depiction of the first marker 150 (as represented by the circle with a dotted boundary). Similarly, depictions of the terrain (e.g. a hill) also would not align properly. Note that a second depiction of the second marker 160 is not visible. If the second marker 160 were an actual object (e.g., a bush) to be included in the golf course, there would not be a photographic image, or portion of a photographic image, to superimpose on a corresponding object in the 3D virtual environment.
Objects and/or markers can be aligned to increase the accuracy of image alignment. In some implementations, the markers can be aligned by correlating markers that should correspond to each other. Markers that correspond to each other can be identified and correlated with each other.
In some implementations, an alignment tool can be used to facilitate selection of corresponding markers. For example, a user can use the alignment tool to select (e.g., click on) the first marker 150 in the photographic image 200 and to select the first marker 150 in the image 250. The user can use a secondary action to correlate (e.g., match) the two selections. For example, the user can press one or more keys on a keyboard (e.g., CTRL + C). As another example, a user can use the alignment tool to select the first marker 150 in the photographic image 200 and drag the selection onto the first marker 150 in the image 250, thereby matching the first marker 150 in the photographic image 200 with the first marker 150 in the image 250. Other implementations are possible. For example, techniques for image recognition or pattern recognition can be used to identify (e.g., automatically) first markers and second markers that should be correlated.
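As a small illustration of how correlated marker pairs might be recorded (whether the pairing comes from user clicks in the alignment tool or from a recognition technique), the sketch below keys markers by a shared identifier; the identifiers, pixel coordinates, and 3D positions are illustrative assumptions rather than the patent's interface.

```python
import numpy as np

def correlate_markers(photo_markers, virtual_markers):
    """Pair each marker selected in the photograph (pixel x, y) with the known 3D
    position of the matching marker in the virtual environment, keyed by marker id."""
    pairs = []
    for marker_id, photo_xy in photo_markers.items():
        if marker_id in virtual_markers:
            pairs.append((np.asarray(photo_xy, dtype=float),
                          np.asarray(virtual_markers[marker_id], dtype=float)))
    return pairs

# e.g., the user matched marker 150 in the photographic image 200 with marker 150 in the image 250
pairs = correlate_markers({"150": (412.0, 388.0)},
                          {"150": (12.4, 50.1, 0.0), "160": (15.0, 47.2, 0.0)})
```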
After at least one pair of corresponding markers is identified (e.g., by a user or a recognition technique), a disparity between the corresponding markers in each pair of corresponding markers can be reduced (e.g., minimized). A minimization algorithm can be used to reduce the disparity between the 3D positions of the corresponding markers in each pair of corresponding markers. For example, in some implementations, a multidimensional minimization algorithm, such as Powell's method, can be used. In particular, the disparities in the 3D positions between corresponding markers in each pair of corresponding markers can be represented by a function z = f(x) = f(x_1, x_2, ..., x_n). Returning to the example illustrated in FIGS. 2A and 2B, n can represent a number of pairs of corresponding markers (e.g., n = 2) that have been identified. x_1 can represent a disparity (e.g., an error function) between a first pair of corresponding markers (e.g., 150 in FIG. 2A and 150 in FIG. 2B), and x_2 can represent a disparity between a second pair of corresponding markers (e.g., 160 in FIG. 2A and 160 in FIG. 2B). A minimization algorithm can be used to minimize the disparities.
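To make the disparity function concrete, the following is a minimal sketch (not the patent's implementation): for a candidate virtual-camera pose, each correlated marker's known 3D position is projected into the image with a simple pinhole model and compared against the marker's pixel location in the photograph. The pose parameterization, the focal length in pixels, and the function names are assumptions made for illustration.

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Camera orientation as a 3x3 rotation built from yaw, pitch, and roll (radians)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return rz @ ry @ rx

def project(pose, marker_xyz, focal_px=1000.0):
    """Project a marker's known 3D position into the virtual camera's image plane."""
    x, y, z, yaw, pitch, roll = pose
    cam = rotation_matrix(yaw, pitch, roll).T @ (np.asarray(marker_xyz) - np.array([x, y, z]))
    return focal_px * cam[:2] / cam[2]   # pixel offsets from the image center

def disparity(pose, pairs, focal_px=1000.0):
    """z = f(x_1, ..., x_n): sum of squared pixel disparities over the n correlated pairs."""
    return sum(float(np.sum((photo_xy - project(pose, marker_xyz, focal_px)) ** 2))
               for photo_xy, marker_xyz in pairs)
```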
In some implementations that use Powell's method, X_0 can be an initial guess at a location of the minimum of the function z = f(x) = f(x_1, x_2, ..., x_n). Base vectors E_k = (0, ..., 1_k, 0, ..., 0) can be generated for k = 1, 2, 3, ..., n. Vectors U_k can be initialized such that U_k = E_k for k = 1, 2, 3, ..., n. The transpose of U_k, or U_k', can be used as the columns of a matrix U, such that U = [U_1', U_2', ..., U_n']. A counter i can be initialized to zero (i.e., i = 0). A starting point p_0 can be initialized such that p_0 = X_i. For k = 1, 2, 3, ..., n, the value of γ = γ_k that minimizes f(p_{k-1} + γ·U_k) can be determined. Then, p_k can be set such that p_k = p_{k-1} + γ_k·U_k. In addition, U_j can be set such that U_j = U_{j+1} for j = 1, 2, 3, ..., n-1; and U_n can be set such that U_n = p_n - p_0. The counter i can be incremented (i.e., i = i+1).
Finally, a value of γ = γ_min can be determined that minimizes f(p_0 + γ·U_n); and X_i can be set such that X_i = p_0 + γ_min·U_n. The steps described in this paragraph (e.g., beginning with initializing p_0) can then be repeated until convergence is achieved, and a location of the minimum of the function z can be determined.
Other implementations are possible. For example, other minimization algorithms, such as, but not limited to, Newton's method, quasi-Newton methods, and the Broyden-Fletcher-Goldfarb-Shanno method, can be used.
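A minimal usage sketch of the minimization step follows, reusing the disparity function and marker pairs sketched above; scipy's Powell implementation is used here purely for brevity (the patent describes the iteration itself, not any particular library), and the initial pose values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# GPS-derived (inaccurate) pose of the physical camera: x, y, z, yaw, pitch, roll
initial_pose = np.array([12.0, 48.0, 1.6, 0.30, -0.05, 0.0])   # illustrative values

# `disparity` and `pairs` are the helpers sketched earlier in this description
result = minimize(disparity, initial_pose, args=(pairs,), method="Powell")
adjusted_pose = result.x   # adjusted 3D position/orientation of the virtual camera
```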
In some implementations, additional alignment can be performed. For example, the alignment tool can allow a user to view the photographic image superimposed on the virtual environment, based on the minimization. The user can further manually align the photographic image with the virtual environment by manually adjusting the placement (e.g., overlay) of the photographic image over the virtual environment. In this manner, the alignment tool allows a user to reduce the disparity between corresponding markers according to the user's visual inspection of the alignment (and not necessarily minimizing the disparity mathematically). Other implementations are possible.
Reducing the disparity between corresponding markers in pairs of corresponding markers can be viewed as a process that is implicit in or that parallels a process of adjusting (e.g., moving) the 3D position of the virtual camera until a superimposition of the markers in the FOV of the physical camera on the markers in the FOV of the virtual camera provides an improved alignment. After disparities between corresponding markers are minimized, the positions of markers in the virtual environment can be determined, and the positions of the markers in the virtual environment can then be used in a reverse projection (e.g., an overlay of the photographic image on the virtual environment) to determine a more accurate 3D position of the virtual camera. The determined 3D position of the virtual camera can then be used to define a more accurate 3D position of the physical camera.
FIG. 4 is a flow chart of an example process 400 for aligning images. The process 400 includes receiving 410 an inaccurate 3D position of a physical camera positioned above a surface in a physical environment corresponding to an actual 3D position of the physical camera where the physical camera captured a photographic image of the physical environment from the actual 3D position and where the photographic image includes a plurality of markers on or above the surface, each marker having a known 3D position on or above the surface. For example, an alignment tool can receive an inaccurate 3D position of the physical camera 130.
The process also includes basing 420 an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera where the 3D virtual environment includes a 3D representation of the surface and a representation of the markers on or above the surface according to their known 3D positions, and where the virtual camera's field of view is of a portion of the representation of the surface. For example, the alignment tool can base the initial 3D position of the virtual camera 120 on the inaccurate 3D position of the physical camera 130.
The process also includes correlating 430 one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view. For example, the alignment tool can correlate the first marker 150 in the photographic image with the corresponding first marker 150 in the virtual camera's field of view. Furthermore, the process includes adjusting 440 the initial 3D position of the virtual camera in the 3D virtual environment based on the disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view. For example, the alignment tool can adjust the initial 3D position of the virtual camera 120. In some implementations, based on the adjusted 3D position of the virtual camera, the process can superimpose the photographic image over the portion of the 3D virtual environment in the field of view of the virtual camera.
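Tying the numbered steps of process 400 together, the sketch below assumes the correlate_markers and disparity helpers sketched earlier, along with illustrative attribute names for the photograph and the virtual environment; it shows the flow of the process, not the patent's code.

```python
import numpy as np
from scipy.optimize import minimize

def align_image(gps_pose, photo, virtual_env):
    """Sketch of process 400: receive (410), base (420), correlate (430), adjust (440)."""
    # 410/420: the virtual camera's initial 3D position is the inaccurate GPS-derived pose
    initial_pose = np.asarray(gps_pose, dtype=float)
    # 430: correlate markers in the photographic image with markers in the virtual camera's FOV
    pairs = correlate_markers(photo.marker_pixels, virtual_env.markers)
    # 440: adjust the virtual camera's pose to reduce the marker disparity
    result = minimize(disparity, initial_pose, args=(pairs,), method="Powell")
    return result.x   # more accurate 3D position/orientation for the virtual (and physical) camera
```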
FIG. 5A is a schematic of an example system for aligning images. A data processing apparatus 510 can include hardware/firmware, an operating system, and one or more programs or other software components, including an alignment tool 520. The alignment tool 520 can use the data processing apparatus 510 to effect the operations described in this specification. Temporarily referring to FIG. 5B, the alignment tool 510 can include photographic images 512, a 3D model 514 (e.g., a model of a 3D virtual environment), a selection engine 516, a minimization engine 518, and a projection engine 519. In some implementations, the selection engine 516 can be used to select a marker in the photographic image 512 that corresponds to a marker in the 3D model 514. The minimization engine 518 can minimize a disparity between the markers, and the projection engine 519 can determine a 3D position of the physical camera based on the minimized disparity. Though the software components and engines identified above are described as being separate or distinct, one or more of the components and/or engines may be combined in a single process or routine. The functional description provided herein, including the separation of responsibility for distinct functions, is by way of example. Other groupings or divisions of functional responsibilities can be made as necessary or in accordance with design preferences.
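By way of a non-limiting illustration, the engines named above might be wired together as sketched below. The class and method names are assumptions made for the example; the specification names the engines but does not define their interfaces, and the minimization and projection bodies are placeholders standing in for the computations described elsewhere in this document.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SelectionEngine:
    # Pairs a marker selected in the photographic image with a marker in the 3D model.
    pairs: List[Tuple[int, int]] = field(default_factory=list)
    def select(self, photo_marker_id: int, model_marker_id: int) -> None:
        self.pairs.append((photo_marker_id, model_marker_id))

@dataclass
class MinimizationEngine:
    def minimize(self, pairs, photo_markers, model_markers, initial_pose):
        # Placeholder: returns the initial pose unchanged. A real engine would run a
        # least-squares refinement over the paired markers, as sketched earlier.
        return initial_pose

@dataclass
class ProjectionEngine:
    def reverse_project(self, refined_pose):
        # Placeholder: a real engine would map the refined virtual camera pose back to a
        # more accurate 3D position of the physical camera.
        return refined_pose

@dataclass
class AlignmentTool:
    selection: SelectionEngine = field(default_factory=SelectionEngine)
    minimization: MinimizationEngine = field(default_factory=MinimizationEngine)
    projection: ProjectionEngine = field(default_factory=ProjectionEngine)
    def align(self, photo_markers, model_markers, initial_pose):
        refined = self.minimization.minimize(self.selection.pairs, photo_markers, model_markers, initial_pose)
        return self.projection.reverse_project(refined)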
Returning to FIG. 5A, the alignment tool 520, in combination with one or more processors and computer-readable media (e.g., memory), represents one or more structural components in the system 500. The alignment tool 520 can be an application, or a portion thereof. As used here, an application is a computer program. An application can be built entirely into the operating system (OS) of the data processing apparatus 510, or an application can have one or more different components located in different locations (e.g., one portion in the OS or kernel mode, one portion in the user mode, and one portion in a remote server), and an application can be built on a runtime library serving as a software platform of the apparatus 510.
The data processing apparatus 510 includes one or more processors 530 and at least one computer-readable medium 540 (e.g., random access memory or another storage device). The data processing apparatus 510 can also include a communication interface 550, one or more user interface devices 560, and one or more additional devices 570. The user interface devices 560 can include display screens, keyboards, a mouse, a stylus, or any combination thereof. Once programmed, the data processing apparatus 510 is operable to perform the operations of process 400, for example. In some implementations, the data processing apparatus 510 can transmit data (e.g., 3D positions, images) or request data over the network 580.
Other implementations and applications of the described systems and techniques are possible. For example, the described systems and techniques can be used to determine more accurate positions of stereo cameras in a physical environment, and align stereo pair photographs of the physical environment captured by the stereo cameras with 3D virtual environments. As another example, photographic images associated with virtual cameras can be used to simulate movement through a computer generated virtual environment (e.g., a 3D virtual model).
In some implementations, the positions of the virtual cameras and the portions of the virtual environment can be used to select shots (e.g., photographic images portraying an associated virtual camera's field of view) to show a first-person view of a golf ball moving through a golf course, for example. As the accuracy of the 3D positions of the virtual cameras increases, the realism of movement through the virtual environment increases. For example, two frames of an animation can be generated from a first photographic image portraying a first FOV and a second photographic image portraying a second FOV. To generate a smooth transition between the two frames, the amount of change between the position of the first virtual camera and the position of the second virtual camera should ideally be minimized.
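By way of a non-limiting illustration, one way to apply that criterion is sketched below: for each sample of a ball trajectory, the camera whose position changes least from the previously chosen camera is selected, favoring smooth transitions between frames. The data layout, the field-of-view test callback, and the fallback rule are assumptions made for the example.

import numpy as np

def select_shots(trajectory, camera_positions, camera_sees):
    # trajectory: (N, 3) array of ball positions; camera_positions: (M, 3) array.
    # camera_sees(cam_index, point) -> True if the point lies inside that camera's field of view.
    chosen, prev_cam = [], None
    for point in trajectory:
        candidates = [i for i in range(len(camera_positions)) if camera_sees(i, point)]
        if not candidates:
            candidates = list(range(len(camera_positions)))   # fall back to every camera
        if prev_cam is None:
            # First frame: take the camera closest to the ball.
            best = min(candidates, key=lambda i: np.linalg.norm(camera_positions[i] - point))
        else:
            # Later frames: minimize the jump from the previously used camera position.
            best = min(candidates, key=lambda i: np.linalg.norm(camera_positions[i] - camera_positions[prev_cam]))
        chosen.append(best)
        prev_cam = best
    return chosen

# Example: three cameras along the fairway and a simple arcing ball flight.
cameras = np.array([[0.0, 0.0, 2.0], [60.0, 5.0, 2.0], [120.0, -5.0, 2.0]])
ball_path = np.stack([np.linspace(0.0, 150.0, 30), np.zeros(30), 10.0 * np.sin(np.linspace(0.0, np.pi, 30))], axis=1)
shot_indices = select_shots(ball_path, cameras, lambda i, p: np.linalg.norm(cameras[i] - p) < 100.0)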
In some implementations, the positions of the virtual cameras and the portions of the virtual environment can be used to select shots that show different angles of the golf ball (e.g., points of view) as it moves through the golf course. For example, a trajectory of the golf ball can be divided into a plurality of portions. In a first portion, the camera angle can show the golf ball's movement through the golf course from a view behind the golf ball (e.g., from the perspective of a player standing at the tee). In a second portion, the camera angle can show the golf ball's movement through the golf course from a side view of the golf ball (e.g., from the perspective of a spectator standing on the sidelines in the middle of the course). In a third portion, the camera angle can show the golf ball's movement through the golf course from a view in front of the golf ball (e.g., from the perspective of the hole).
In some implementations, the 3D positions of the virtual cameras and the portions of the virtual environment in the FOVs of the virtual cameras can be used to select shots, e.g., to show different angles of the virtual environment according to a given position. For example, multiple 3D positions (e.g., different orientations) of a single virtual camera can be used to rotate a view of the virtual environment to see 360 degrees around a single position on the golf course. As another example, multiple virtual cameras disposed at different 3D positions around a golf ball can be used to rotate a view to see angles ranging from 0 to 360 degrees around a stationary golf ball. Other implementations are possible.
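By way of a non-limiting illustration, the sketch below places a ring of virtual camera positions evenly around a stationary ball; the spacing, radius, and height values are assumptions made for the example.

import numpy as np

def ring_of_cameras(center, radius, height, n_views=36):
    # Evenly spaced virtual camera positions on a circle around a point of interest.
    angles = np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False)
    xs = center[0] + radius * np.cos(angles)
    ys = center[1] + radius * np.sin(angles)
    zs = np.full(n_views, center[2] + height)
    return np.stack([xs, ys, zs], axis=1)              # each row is one camera position

# Example: 36 views, ten degrees apart, around a ball resting at (100, 20, 0).
ring = ring_of_cameras(np.array([100.0, 20.0, 0.0]), radius=3.0, height=1.5)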
In some implementations, the positions of the virtual cameras can be plotted on a grid that corresponds to the virtual environment. The plotted grid can be used to determine statistics associated with the virtual cameras and the virtual environment, such as, but not limited to, virtual camera coverage and density. A FOV of a virtual camera can be calculated based on the 3D position of the virtual camera and the virtual camera's sensor size and focal length. A representation of the FOV of each virtual camera (e.g., 125 and 135 of FIG. 1) can also be superimposed, with each corresponding virtual camera, on the grid to show virtual camera coverage. For example, virtual camera coverage can be defined as the amount, or area, of the grid that has been captured in a photographic image. Virtual camera density can be defined, for example, as the number of virtual cameras plotted within a predetermined area of the grid (e.g., a predetermined number of cells).
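By way of a non-limiting illustration, the sketch below computes the statistics just described: the horizontal field-of-view angle follows the standard pinhole relation, a grid cell counts as covered when its center falls inside some camera's simplified, top-down view cone, and density is a camera count within a rectangular window. The grid layout, the 2D cone test, and the window are assumptions made for the example.

import numpy as np

def horizontal_fov(sensor_width_mm, focal_length_mm):
    # Field-of-view angle (radians) from the camera's sensor size and focal length.
    return 2.0 * np.arctan(sensor_width_mm / (2.0 * focal_length_mm))

def grid_coverage(cameras, grid_min, grid_max, cell_size):
    # cameras: list of (position_xy, view_direction_xy, fov_radians, max_range) tuples.
    xs = np.arange(grid_min[0], grid_max[0], cell_size) + cell_size / 2.0
    ys = np.arange(grid_min[1], grid_max[1], cell_size) + cell_size / 2.0
    covered, total = 0, 0
    for x in xs:
        for y in ys:
            total += 1
            cell = np.array([x, y])
            for position, direction, fov, max_range in cameras:
                offset = cell - np.asarray(position, dtype=float)
                distance = np.linalg.norm(offset)
                if distance == 0.0 or distance > max_range:
                    continue
                unit_dir = np.asarray(direction, dtype=float)
                unit_dir = unit_dir / np.linalg.norm(unit_dir)
                angle = np.arccos(np.clip(offset @ unit_dir / distance, -1.0, 1.0))
                if angle <= fov / 2.0:
                    covered += 1
                    break
    return covered / total                              # fraction of the grid captured by at least one camera

def camera_density(camera_positions_xy, area_min, area_max):
    # Number of virtual cameras plotted within a predetermined rectangular area of the grid.
    return sum(1 for p in camera_positions_xy
               if area_min[0] <= p[0] <= area_max[0] and area_min[1] <= p[1] <= area_max[1])

# Example: a single camera at the tee looking down the fairway with a 24 mm lens on a 36 mm-wide sensor.
fov = horizontal_fov(36.0, 24.0)
coverage = grid_coverage([((0.0, 0.0), (1.0, 0.0), fov, 150.0)], (0.0, -50.0), (200.0, 50.0), 10.0)
density = camera_density([(0.0, 0.0)], (-25.0, -25.0), (25.0, 25.0))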
Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible program carrier for execution by, or to control the operation of, data processing apparatus. The tangible program carrier can be a computer-readable medium. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, or a combination of one or more of them.
The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network. The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, to name just a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
To provide for interaction with a user, embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), e.g., the Internet.
The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any implementation or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular implementations. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Particular embodiments of the subject matter described in this specification have been described. Other embodiments are within the scope of the following claims. For example, the actions recited in the claims can be performed in a different order and still achieve desirable results. As one example, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous.
What is claimed is:

Claims

1. A computer-implemented method, comprising:
receiving an inaccurate three-dimensional (3D) position of a physical camera positioned above a surface in a physical environment corresponding to an actual 3D position of the physical camera where the physical camera captured a photographic image of the physical environment from the actual 3D position and where the photographic image includes a plurality of markers on or above the surface, each marker having a known 3D position on or above the surface;
basing an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera where the 3D virtual environment includes a 3D representation of the surface and a representation of the markers on or above the surface according to their known 3D positions, and where the virtual camera's field of view is of a portion of the representation of the surface;
correlating one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view; and
adjusting the initial 3D position of the virtual camera in the 3D virtual environment based on a disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view.
2. The method of claim 1, where the photographic image is a two-dimensional (2D) photographic image.
3. The method of claim 1, where the 3D position of the physical camera is defined by at least one of position data and attitude data.
4. The method of claim 1, where the one or more markers in the photographic image are visual markers.
5. The method of claim 1, where the one or more markers in the photographic image are virtual markers that are generated after the photographic image is captured.
6. The method of claim 1, further comprising: superimposing the photographic image over the portion of the 3D virtual environment in the field of view of the virtual camera.
7. The method of claim 1, further comprising: receiving input matching one or more of the one or more markers in the photographic image to one or more of the one or more markers in the virtual camera's field of view.
8. The method of claim 7, where the correlating includes: minimizing the disparity based on the matching, producing a minimized disparity; and generating a reverse projection of an adjusted 3D position of the virtual camera based on the minimized disparity.
9. The method of claim 8, further comprising: generating a grid of the 3D virtual environment that includes the virtual camera configured according to the reverse projection.
10. The method of claim 9, further comprising: calculating the field of view of the virtual camera; and displaying a representation of the field of view on the grid.
11. A computer program product, encoded on a computer-readable medium, operable to cause one or more processors to perform operations comprising:
receiving an inaccurate three-dimensional (3D) position of a physical camera positioned above a surface in a physical environment corresponding to an actual 3D position of the physical camera where the physical camera captured a photographic image of the physical environment from the actual 3D position and where the photographic image includes a plurality of markers on or above the surface, each marker having a known 3D position on or above the surface;
basing an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera where the 3D virtual environment includes a 3D representation of the surface and a representation of the markers on or above the surface according to their known 3D positions, and where the virtual camera's field of view is of a portion of the representation of the surface;
correlating one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view; and
adjusting the initial 3D position of the virtual camera in the 3D virtual environment based on a disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view.
12. The computer program product of claim 11, where the photographic image is a two-dimensional (2D) photographic image.
13. The computer program product of claim 11, where the 3D position of the physical camera is defined by at least one of position data and attitude data.
14. The computer program product of claim 11, where the one or more markers in the photographic image are visual markers.
15. The computer program product of claim 11, where the one or more markers in the photographic image are virtual markers that are generated after the photographic image is captured.
16. The computer program product of claim 11, operable to cause one or more processors to perform operations further comprising: superimposing the photographic image over the portion of the 3D virtual environment in the field of view of the virtual camera.
17. The computer program product of claim 11, operable to cause one or more processors to perform operations further comprising: receiving input matching one or more of the one or more markers in the photographic image to one or more of the one or more markers in the virtual camera's field of view.
18. The computer program product of claim 17, where the correlating includes: minimizing the disparity based on the matching, producing a minimized disparity; and generating a reverse projection of an adjusted 3D position of the virtual camera based on the minimized disparity.
19. The computer program product of claim 18, operable to cause one or more processors to perform operations further comprising: generating a grid of the 3D virtual environment that includes the virtual camera configured according to the reverse projection.
20. The computer program product of claim 19, operable to cause one or more processors to perform operations further comprising: calculating the field of view of the virtual camera; and displaying a representation of the field of view on the grid.
21. A system comprising:
a display device;
a machine-readable storage device including a program product; and
one or more computers operable to execute the program product, interact with the display device, and perform operations comprising:
receiving an inaccurate three-dimensional (3D) position of a physical camera positioned above a surface in a physical environment corresponding to an actual 3D position of the physical camera where the physical camera captured a photographic image of the physical environment from the actual 3D position and where the photographic image includes a plurality of markers on or above the surface, each marker having a known 3D position on or above the surface;
basing an initial 3D position of a virtual camera in a 3D virtual environment on the inaccurate 3D position of the physical camera where the 3D virtual environment includes a 3D representation of the surface and a representation of the markers on or above the surface according to their known 3D positions, and where the virtual camera's field of view is of a portion of the representation of the surface;
correlating one or more markers in the photographic image with one or more markers in the 3D virtual environment that appear in the virtual camera's field of view; and
adjusting the initial 3D position of the virtual camera in the 3D virtual environment based on a disparity between the one or more markers' 3D positions in the photographic image as compared to the one or more markers' 3D positions in the virtual camera's field of view.
PCT/US2009/031335 2009-01-16 2009-01-16 Position estimation refinement WO2010082933A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2009/031335 WO2010082933A1 (en) 2009-01-16 2009-01-16 Position estimation refinement

Publications (1)

Publication Number Publication Date
WO2010082933A1 true WO2010082933A1 (en) 2010-07-22

Family

ID=40790756

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/031335 WO2010082933A1 (en) 2009-01-16 2009-01-16 Position estimation refinement

Country Status (1)

Country Link
WO (1) WO2010082933A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020084974A1 (en) * 1997-09-01 2002-07-04 Toshikazu Ohshima Apparatus for presenting mixed reality shared among operators
US6990492B2 (en) * 1998-11-05 2006-01-24 International Business Machines Corporation Method for controlling access to information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GERALD BIANCHI ET AL: "High-fidelity visuo-haptic interaction with virtual objects in multi-modal AR systems", MIXED AND AUGMENTED REALITY, 2006. ISMAR 2006. IEEE/ACM INTERNATIONAL SYMPOSIUM ON, IEEE, PI, 1 October 2006 (2006-10-01), pages 187 - 196, XP031014669, ISBN: 978-1-4244-0650-0 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2615580A1 (en) 2012-01-13 2013-07-17 Softkinetic Software Automatic scene calibration
WO2013104800A1 (en) 2012-01-13 2013-07-18 Softkinetic Software Automatic scene calibration

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 09789428; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12/01/2012))
122 Ep: pct application non-entry in european phase (Ref document number: 09789428; Country of ref document: EP; Kind code of ref document: A1)