
Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision [article]

Xiaoshi Wu, Hadar Averbuch-Elor, Jin Sun, Noah Snavely
2021 arXiv   pre-print
WikiScenes forms a new testbed for multimodal reasoning involving images, text, and 3D geometry. We demonstrate the utility of WikiScenes for learning semantic concepts over images and 3D models.  ...  However, a major source of information available for these 3D-augmented collections, namely language (e.g., from image captions), has been virtually untapped.  ...  This work was supported by the National Science Foundation (IIS-2008313), by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program, by the Zuckerman STEM leadership  ... 
arXiv:2108.05863v1 fatcat:vwv443xwwbaidmqrpwvogtigzi

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance [article]

Katherine Crowson and Stella Biderman and Daniel Kornis and Dashiell Stander and Eric Hallahan and Louis Castricato and Edward Raff
2022 arXiv   pre-print
We demonstrate a novel methodology for both tasks which is capable of producing images of high visual quality from text prompts of significant semantic complexity without any training by using a multimodal  ...  Generating and editing images from open domain text prompts is a challenging task that heretofore has required expensive and specially trained models.  ...  Acknowledgements We would like to acknowledge Ryan Murdock, who developed a very similar technique for combining VQGAN and CLIP simultaneously to us [30, 31] but did not release his approach.  ... 
arXiv:2204.08583v1 fatcat:tsfe7nozlvgejg2m7lrxlugjta

D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding [article]

Dave Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang
2022 arXiv   pre-print
Despite developments in both areas, the limited amount of available 3D vision-language data causes overfitting issues for 3D visual grounding and 3D dense captioning methods.  ...  Recent studies on dense captioning and visual grounding in 3D have achieved impressive results.  ...  Machine Learning on Static and Dynamic 3D Data Practical.  ... 
arXiv:2112.01551v2 fatcat:7dcyeual3rgbnpv5f7dkak5x6a

TriCoLo: Trimodal Contrastive Loss for Fine-grained Text to Shape Retrieval [article]

Yue Ruan, Han-Hung Lee, Ke Zhang, Angel X. Chang
2022 arXiv   pre-print
On the other hand, work on joint representation learning for 3D shapes and text has thus far mostly focused on improving embeddings through modeling of complex attention between representations, or multi-task  ...  Prior work in 3D and text representations has also focused on bimodal representation learning using either voxels or multi-view images with text.  ...  Acknowledgements This work is funded by the Canada CIFAR AI Chair program and an NSERC Discovery Grant. This research was enabled in part by support provided by WestGrid and Compute Canada.  ... 
arXiv:2201.07366v1 fatcat:ff2frrehjffivbr27tgndoddmu

Immersive VR for scientific visualization: a progress report

R.M. Simpson, J.J. LaViola, D.H. Laidlaw, A.S. Forsberg, A. van Dam
2000 IEEE Computer Graphics and Applications  
The display geometry of the Virtual Tricorder closely reflects the geometry of the 6DOF Logitech FlyMouse, enhanced with transparent menus.  ...  Immersive virtual reality (IVR) has the potential to be a powerful tool for the visualization of burgeoning scientific data sets and models.  ...  Some contributed sidebars, others submitted detailed comments and suggested fixes that we have shamelessly but gratefully incorporated.  ... 
doi:10.1109/38.888006 fatcat:emldpdf7w5huhpt7xj4emhjibq

EMCO#5 in one file

Svenn-Arve Myklebost
2018 Early Modern Culture Online  
Among Bulwer's more radical claims are his description of gesture as a natural and universal language, 'spoken' and understood by all people (a pre-Babel form of human expression), in Titus Andronicus  ...  standing on top of a globe; the right-hand image is a moment in a scene from a Shakespeare play, somewhat emblem-like in the manner it combines the visual and the verbal.  ...  Call for contributions As always, we will accept research articles that present original material on early modern topics within the fields of literature, history, art history, philosophy, music and language  ... 
doi:10.15845/emco.v5i0.1516 fatcat:zdgjdxdibrd3rib4crndzvnbea

Calculation of the A term of magnetic circular dichroism based on time dependent-density functional theory I. Formulation and implementation

Michael Seth, Tom Ziegler, Arup Banerjee, Jochen Autschbach, Stan J. A. van Gisbergen, Evert J. Baerends
2004 Journal of Chemical Physics  
Background noise may be still more detrimental in the case of preschool children who speak one language (L1) at home and who start to learn a second language (L2) in nursery school.  ...  The effect of noise on novel word learning in sequential bilingual children.  ...  The array is cabled back to the tower for power and signal collection. The tower is microwave-linked to shore for internet-based control and data retrieval.  ... 
doi:10.1063/1.1747828 pmid:15268124 fatcat:hwyfald5a5f4njmw5h7p6vjhnu

Teaching and Learning in the Digital World: Possibilities and Challenges

Jeremy Dubeau, Kevon Licorish, Tim Scobie
2013 unpublished
Acknowledgments This research was supported by grants from the Social Sciences and Humanities Research Council of Canada, the Canadian Council on Learning, and the Fonds de recherche sur la société et  ...  Eric Jackman Institute of Child Study, University of Toronto, Canada, for the insights and opportunities enabled by their involvement.  ...  One assignment, the Wanted Poster, is an opportunity for new teachers to learn basic computer skills for combining images with words and visual design.  ... 

Special Issue: Semantic Informational Technologies

Vladimir Fomichov, Anton Železnikar, Matjaž Gams, Jožef Stefan, Drago Torkar, Jožef Stefan, Editorial Board, Juan Carlos, Augusto, Argentina, Costin Badica, Romania (+20 others)
2010 unpublished
This paper presents an approach in the domain of collaborative systems for working and learning practices called KP-Lab System.  ...  The possibilities of using SK-languages defined by the theory of K-representations for building semantic annotations of informational sources and for constructing semantic representations of discourses  ...  The primary PR anonymisation is done by the Hospital Information System of the University Specialised Hospital for Active Treatment of Endocrinology "Acad. Acknowledgement  ... 

Hand, Haut, haptische Medien

Jana Herwig
2017 unpublished
The hand is thus interpreted as an integrative organ between the spheres of the material and the symbolic, founded by the common segmentation of hand and language.  ...  Subsequently, the relationship of hand, device and gaze is problematized, departing from Martin Heidegger's "Umsicht" (circumspection), which is used for the description of the "situated agents" of artificial  ...  Language as the technology of human extension, whose powers of division and separation we know so well, may have been the "Tower of Babel" by which men sought to scale the highest heavens. 308 Ein von  ... 
doi:10.25365/thesis.48960 fatcat:ppc626tby5dmzkll4g23svktzm