
Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model [article]

Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc Van Gool, Errui Ding
2022 arXiv   pre-print
Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation that requires little manual annotation while being applicable to a wide variety of manipulations.  ...  Our method approaches the targets by deeply exploiting the power of the large-scale pre-trained vision-language model CLIP.  ...  This work was supported by the PRIN project CREATIVE Prot. 2020ZSL9F9, by the EUREGIO project OLIVER and by the EU H2020 AI4Media project under Grant 951911.  ... 
arXiv:2111.13333v2 fatcat:ewsrexmxmbbrthvhrftgm73k2i

Multimodal Image Synthesis and Editing: A Survey [article]

Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Lingjie Liu, Adam Kortylewski, Christian Theobalt, Eric Xing
2022 arXiv   pre-print
This is followed by a comprehensive description of benchmark datasets and corresponding evaluation metrics as widely adopted in multimodal image synthesis and editing, as well as detailed comparisons of  ...  vision and deep learning research.  ...  ACKNOWLEDGMENTS This study is supported under the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry  ... 
arXiv:2112.13592v3 fatcat:46twjhz3hbe6rpm33k6ilnisga

A Roadmap for Big Model [article]

Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han (+88 others)
2022 arXiv   pre-print
We introduce 16 specific BM-related topics in those four parts, they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability  ...  , Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research.  ...  Vision Language Matching Vision Language Matching (VLM) is similar to the Next Sentence Prediction (NSP) task in NLP, which requires the model to predict whether the image and text are matched.  ... 
arXiv:2203.14101v4 fatcat:rdikzudoezak5b36cf6hhne5u4

State-of-the-Art in the Architecture, Methods and Applications of StyleGAN [article]

Amit H. Bermano and Rinon Gal and Yuval Alaluf and Ron Mokady and Yotam Nitzan and Omer Tov and Or Patashnik and Daniel Cohen-Or
2022 arXiv   pre-print
However, the control offered by StyleGAN is inherently limited to the generator's learned distribution, and can only be applied to images generated by StyleGAN itself.  ...  Despite being learned with no supervision, it is surprisingly well-behaved and remarkably disentangled.  ...  Kumari et al. [2021] propose to leverage the feature space of pre-trained vision models, trained for different vision tasks.  ... 
arXiv:2202.14020v1 fatcat:qu3plbdnszdujcwxwq3zizysje

On the Opportunities and Risks of Foundation Models [article]

Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch (+102 others)
2022 arXiv   pre-print
This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical  ...  principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse  ...  This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotic manipulation, reasoning, human interaction)  ... 
arXiv:2108.07258v3 fatcat:kohwrwk2ybf7fd7wsuz2gp65ki

A comprehensive survey on semantic facial attribute editing using generative adversarial networks [article]

Ahmad Nickabadi, Maryam Saeedi Fard, Nastaran Moradzadeh Farid, Najmeh Mohammadbagheri
2022 arXiv   pre-print
The requested modifications are provided as an attribute vector or in the form of driving face image and the whole process is performed by the corresponding models.  ...  Among different domains, face photos have received a great deal of attention and a large number of face generation and manipulation models have been proposed.  ...  By incorporation of a pre-trained proxy network and a feature threshold loss, the structure and texture are enforced to be disentangled in the latent space.  ... 
arXiv:2205.10587v1 fatcat:thpe4crcgndifb5mhtuveww4ji

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey [article]

Julian Wörmann, Daniel Bogdoll, Etienne Bührle, Han Chen, Evaristus Fuh Chuo, Kostadin Cvejoski, Ludger van Elst, Tobias Gleißner, Philip Gottschall, Stefan Griesche, Christian Hellert, Christian Hesels (+34 others)
2022 arXiv   pre-print
Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models  ...  However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training.  ...  Natural Language Models, such as proposed in [164], [563], [78] are capable of directly converting natural language text, e.g., common sense text like Wikipedia articles or textual rules like road  ... 
arXiv:2205.04712v1 fatcat:u2bgxr2ctnfdjcdbruzrtjwot4

State of the Art on Neural Rendering [article]

Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello (+7 others)
2020 arXiv   pre-print
Concurrently, progress in computer vision and machine learning have given rise to a new approach to image synthesis and editing, namely deep generative models.  ...  into network training.  ...  by a pre-trained image classifier F (e.g., VGG network [SZ15]).  ... 
arXiv:2004.03805v1 fatcat:6qs7ddftkfbotdlfd4ks7llovq

A 20-Year Community Roadmap for Artificial Intelligence Research in the US [article]

Yolanda Gil, Bart Selman
2019 arXiv   pre-print
AI systems can now translate across multiple languages, identify objects in images and video, streamline manufacturing processes, and control cars.  ...  Achieving the full potential of AI technologies poses research challenges that require a radical transformation of the AI research enterprise, facilitated by significant and sustained investment.  ...  language descriptions based on first-person view and grounding language to objects using well-trained computer vision models).  ... 
arXiv:1908.02624v1 fatcat:jza6i2tzufgeracsou77qukbu4

A Survey on Intrinsic Images: Delving Deep Into Lambert and Beyond [article]

Elena Garces, Carlos Rodriguez-Pardo, Dan Casas, Jorge Lopez-Moreno
2021 arXiv   pre-print
a shading, produced by the interaction between light and geometry.  ...  In this survey, we overview those results in context of well-known intrinsic image data sets and relevant metrics used in the literature, discussing their suitability to predict a desirable intrinsic image  ...  Acknowledgments Elena Garces was partially supported by a Torres Quevedo Fellowship (PTQ2018-009868).  ... 
arXiv:2112.03842v1 fatcat:ciwwxoodq5fl7ma4jqjrgp7k5m

Deep Reinforcement Learning [article]

Yuxi Li
2018 arXiv   pre-print
After that, we discuss RL applications, including games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer  ...  Next we discuss RL core elements, including value function, policy, reward, model, exploration vs. exploitation, and representation.  ...  Lanctot et al. (2017) observe that independent RL, in which each agent learns by interacting with the environment, oblivious to other agents, can overfit the learned policies to other agents' policies  ... 
arXiv:1810.06339v1 fatcat:kp7atz5pdbeqta352e6b3nmuhy

Harnessing value from data science in business: ensuring explainability and fairness of solutions [article]

Krzysztof Chomiak, Michał Miktus
2021 arXiv   pre-print
For fairness, the authors discuss the bias-inducing specifics, as well as relevant mitigation methods, concluding with a set of recipes for introducing fairness in data-driven organizations.  ...  The paper introduces concepts of fairness and explainability (XAI) in artificial intelligence, oriented to solve sophisticated business problems.  ...  Ante-hoc explainability The behaviour of a plethora of AI models is, to a large extent, driven by datasets employed in their training stage.  ... 
arXiv:2108.07714v1 fatcat:s36ftwpzyvbaxnawhtcdtpyobe

A survey on data‐efficient algorithms in big data era

Amina Adadi
2021 Journal of Big Data  
less training data and in particular less human supervision.  ...  This has triggered a serious debate in both the industrial and academic communities calling for more data-efficient models that harness the power of artificial learners while achieving good results with  ...  one style to another) [159], Text-to-Image Translation [160], Audio-to-Image Generation [161], Text-to-Speech synthesis [162] … etc.  ... 
doi:10.1186/s40537-021-00419-9 fatcat:v4uahsvhlzdldlxqf24bshmja4

Artificial Intelligence: Research Impact on Key Industries; the Upper-Rhine Artificial Intelligence Symposium (UR-AI 2020) [article]

Andreas Christ, Franz Quint
2020 arXiv   pre-print
) and the University of Applied Sciences and Arts Northwestern Switzerland.  ...  The alliance's common goal is to reinforce the transfer of knowledge, research, and technology, as well as the cross-border mobility of students.  ...  Main sponsor: esentri AG, Ettlingen Acknowledgement This research and development project is funded by the German Federal Ministry of Education and Research (BMBF) and the European Social Fund (ESF)  ... 
arXiv:2010.16241v1 fatcat:y6lc2dmlyvh55bw2ytfbf7hwta

Basque Conference on Cyber Physical Systems and Artificial Intelligence

Manuel Graña
2022 Zenodo  
This entry contains the proceedings of the Basque Conference on Cyber Physical Systems and Artificial Intelligence  ...  Under FONDEF project title: "Advanced Data Science Methods for medication error prevention", project CODE: ID20I10001.  ...  Acknowledgements This work was supported by the CybSPEED H2020-MSCA-RISE-2017 grant 777720 and by European Regional Development Fund within the OP "Science and Education for Smart Growth 2014-2020", Project  ... 
doi:10.5281/zenodo.6568145 fatcat:lbwc7clohbeybmmdjhdlalxziu