494 Hits in 3.6 sec

Reconfigurable Parallel Multi-way Associative Cache with Miss-fetch Merge for Anisotropic Texture Filtering

Youngsik Kim
2015 International Journal of Multimedia and Ubiquitous Engineering  
Anisotropic texture filtering has been developed for high-quality three-dimensional (3D) computer graphics and multimedia applications.  ...  This paper proposes an effective parallel multi-way set associative cache for anisotropic texture filtering, which can be adaptively reconfigured based on the number of probe samples.  ...  The benchmark suites for the simulation use OpenGL library as 3D graphics API in Table 2, Table 3, and Figure 6.  ... 
doi:10.14257/ijmue.2015.10.8.39 fatcat:t7ncxcnmgzgjfktphvynomsy4e

A 36fps SXGA 3D Display Processor with a Programmable 3D Graphics Rendering Engine

Seok-Hoon Kim, Jae-Sung Yoon, Chang-Hyo Yu, Donghyun Kim, Kyusik Chung, Han Shin Lim, HyunWook Park, Lee-Sup Kim
2007 Digest of technical papers / IEEE International Solid-State Circuits Conference  
To solve this problem, the SE processes the interpolation and the multiplexing operation line by line.  ...  The multiplexing unit loads and multiplexes these lines into three lines of an output image.  ...  There has been tremendous progress in 3D graphics hardware for  ...  A texture cache is adopted to  ... 
doi:10.1109/isscc.2007.373401 dblp:conf/isscc/KimYYKCLPK07 fatcat:mew22hpzhzfszk46ludccwpd6m

Accelerating image recognition on mobile devices using GPGPU

Miguel Bordallo López, Henri Nykänen, Jari Hannuksela, Olli Silvén, Markku Vehviläinen, John D. Owens, I-Jong Lin, Yu-Jin Zhang, Giordano B. Beretta
2011 Parallel Processing for Imaging Applications  
We have implemented a series of image processing techniques in the shader language of OpenGL ES 2.0, compiled them for a mobile graphics processing unit and performed tests on a mobile application processor  ...  The use of Graphic Processing Units for computing is very well suited for parallel processing and the addition of programmable stages and high precision arithmetic provide for opportunities to implement  ...  We would also like to thank the Texas Instruments University Program for the donation of the research equipment.  ... 
doi:10.1117/12.872860 dblp:conf/ppia/LopezNHSV11 fatcat:atn3p4fazzgqflfsncib6pksva

Practical Line Rasterization for Multi-resolution Textures [article]

Javier Taibo, Alberto Jaspe, Antonio Seoane, Marco Agus, Luis Hernández
2014 Smart Tools and Applications in Graphics  
Draping 2D vectorial information over a 3D terrain elevation model is usually performed by real-time rendering to texture.  ...  In this paper, we address the problems of 2D line rasterization on a multi-resolution texturing engine from a pragmatical point of view; some alternative solutions are presented, compared and evaluated  ...  This work is partially supported by the EU FP7 Program under the DIVA (290277) project.  ... 
doi:10.2312/stag.20141234 dblp:conf/egItaly/TaiboJSAH14 fatcat:pp67i33hm5hzhnzm3tiu256fte

Pomegranate

Matthew Eldridge, Homan Igehy, Pat Hanrahan
2000 Proceedings of the 27th annual conference on Computer graphics and interactive techniques - SIGGRAPH '00  
Pomegranate's scalability is achieved with a novel "sort-everywhere" architecture that distributes work in a balanced fashion at every stage of the pipeline, keeping the amount of work performed by each  ...  Because of the balanced distribution, a scalable network based on high-speed point-to-point links can be used for communicating between the pipelines.  ...  We would like to thank Greg Humphreys, John Owens and the rest of the Stanford Graphics Lab for their reviews of this paper and their insights.  ... 
doi:10.1145/344779.344981 dblp:conf/siggraph/EldridgeIH00 fatcat:p4ogzeh7mzeudhwmjor4mln3my

Texture Caches

M. Doggett
2012 IEEE Micro  
Most GPUs have multiple texture filtering units running in parallel, and the texture cache must supply these with texels.  ...  The texture cache changed into a two level cache as seen in the NVIDIA G80 architecture [14]. The G80 had an L2 cache distributed to each DRAM channel.  ... 
doi:10.1109/mm.2012.44 fatcat:foxaab2tcvcddp5hvwrn4z2qzq

Chromium Renderserver: Scalable and Open Remote Rendering Infrastructure

B. Paul, S. Ahern, E.W. Bethel, E. Brugger, R. Cook, J. Daniel, K. Lewis, J. Owen, D. Southard
2008 IEEE Transactions on Visualization and Computer Graphics  
The new contributions of this work include a solution to the problem of synchronizing X11 and OpenGL command streams, remote delivery of parallel hardware-accelerated rendering, and a performance analysis  ...  Chromium Renderserver (CRRS) is software infrastructure that provides the ability for one or more users to run and view image output from unmodified, interactive OpenGL and X11 applications on a remote  ...  Chromium is a "drop-in" replacement for OpenGL that provides the ability for any OpenGL application to run on a parallel system equipped with graphics hardware, including distribute memory clusters [7  ... 
doi:10.1109/tvcg.2007.70631 pmid:18369269 fatcat:rtdccu3g4za2vfxg3b47pyvh6q

Fast OBJ file importing and parsing in CUDA

Aidan L. Possemiers, Ickjai Lee
2015 Computational Visual Media  
Alias-Wavefront OBJ meshes are a common text file type for transferring 3D mesh data between applications made by different vendors.  ...  These results demonstrate that the time is right for further research into the use of data-parallel GPU acceleration beyond that of computer graphics and high performance computing.  ...  Unlike PLY or STL, OBJ files store 3D mesh data as a series of single line elements prefixed by a character sequence: "#" for human readable commenting; "v" for vertex coordinates; "vt" for texture coordinates  ... 
doi:10.1007/s41095-015-0021-5 fatcat:pcvdexgwvzeb5ooqbn72ftaa5y

The GPU Computing Era

John Nickolls, William J Dally
2010 IEEE Micro  
Acknowledgments We thank Jen-Hsun Huang of NVIDIA for his Hot Chips 21 keynote 23 that inspired this article, and the entire NVIDIA team that brings GPU computing to market.  ...  In 1997, NVIDIA released the RIVA 128 3D single-chip graphics accelerator for games and 3D visualization applications, programmed with Microsoft Direct3D and OpenGL.  ...  By using a latency-optimized CPU to run the code's serial fraction, it gives the best possible performance on the serial fraction, which is important even for mostly parallel codes.  ... 
doi:10.1109/mm.2010.41 fatcat:tmcgmo7v5zasbpakpqk37anni4

OO-VR

Chenhao Xie, Fu Xin, Mingsong Chen, Shuaiwen Leon Song
2019 Proceedings of the 46th International Symposium on Computer Architecture - ISCA '19  
With the strong computation capability, NUMA-based multi-GPU system is a promising candidate to provide sustainable and scalable performance for Virtual Reality.  ...  By conducting comprehensive characterizations on different kinds of parallel rendering frameworks, we observe that distributing the rendering object along with its required data per GPM can reduce the  ...  Each SM 1 is composed of a unified texture/L1 cache (TX/L1 $), several texture units (TXU) and hundreds of shader cores that execute a variety of graphics shaders (e.g., the functions in both geometry  ... 
doi:10.1145/3307650.3322247 dblp:conf/isca/XieFCS19 fatcat:yhmmnagmwfaknjaw2mgbnidoia

Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization

Jose-Maria Arnau, Joan-Manuel Parcerisa, Polychronis Xekalakis
2014 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA)  
We have measured this fragment redundancy for a set of commercial Android applications, and found that more than 40% of the fragments used in a frame have been already computed in a prior frame.  ...  In terms of rendering these images, this behavior translates into the creation of many fragment programs with the exact same input data.  ...  The GPU memory hierarchy includes several first level caches employed for storing geometry (Vertex and Tile Caches) and textures (Texture Cache), and are connected through a shared bus to the L2 cache.  ... 
doi:10.1109/isca.2014.6853207 dblp:conf/isca/ArnauPX14 fatcat:347r3f4ttjbibku4i46kmdn6aq

Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization

Jose-Maria Arnau, Joan-Manuel Parcerisa, Polychronis Xekalakis
2014 SIGARCH Computer Architecture News  
We have measured this fragment redundancy for a set of commercial Android applications, and found that more than 40% of the fragments used in a frame have been already computed in a prior frame.  ...  In terms of rendering these images, this behavior translates into the creation of many fragment programs with the exact same input data.  ...  The GPU memory hierarchy includes several first level caches employed for storing geometry (Vertex and Tile Caches) and textures (Texture Cache), and are connected through a shared bus to the L2 cache.  ... 
doi:10.1145/2678373.2665748 fatcat:zc6qezxfo5dbdbln3rgmgy6v3u

Neither more nor less: optimizing thread-level parallelism for GPGPUs

Jose-Maria Arnau, Joan-Manuel Parcerisa, Polychronis Xekalakis
2013 Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques  
Although consecutive frames tend to operate on the same textures, their re-use distances are so big that to the caches fetching textures appears to be a streaming operation.  ...  Since the physics part of the rendering has to be computed sequentially for two consecutive frames, this naturally leads to an increase in the input delay latency for PFR compared with traditional systems  ...  R-PFR achieves its best results for games with a small number of user inputs, such as angrybirds or badpiggies (see Figure 8), where parallel processing is enabled most of the time.  ... 
doi:10.1109/pact.2013.6618806 dblp:conf/IEEEpact/ArnauPX13 fatcat:c6tyy5vi7bdi3eottnmtk5uily

Avionics Graphics Hardware Performance Prediction with Machine Learning

Simon R. Girard, Vincent Legault, Guy Bois, Jean-François Boland
2019 Scientific Programming  
First, we create nonparametric models of the underlying hardware, with machine learning, by analyzing the instantaneous frames per second (FPS) of the rendering of a synthetic 3D scene and by drawing multiple  ...  As proven by previous hardware emulation tools, there is also a potential for development cost reduction, by enabling developers to have a first estimation of the performance of its graphical engine early  ...  Acknowledgments Special thanks are due to CAE Inc. for providing experimental material, industrial 3D scenes from the World CDB.  ... 
doi:10.1155/2019/9195845 fatcat:57utt7b7wbaynabymtn2qry2n4

Vortex: Extending the RISC-V ISA for GPGPU and 3D-Graphics Research [article]

Blaise Tine, Fares Elsabbagh, Krishna Yalamarthy, Hyesoon Kim
2021 arXiv   pre-print
The main goal of the ISA extension proposal is to minimize the ISA changes so that the corresponding changes to the open-source ecosystem are also minimal, which makes for a sustainable development ecosystem  ...  We argue that one of the reasons for the lack of open-source infrastructure for GPUs is rooted in the complexity of their ISA and software stacks. In this work, we first propose an ISA extension to RISC-V  ...  We also thank HPArch group members, Jeff Young, Seyong Lee, Jeff Vetter, Chad Kersey, and the anonymous reviewers for their feedback on improving the paper.  ... 
arXiv:2110.10857v1 fatcat:bxjizz5hx5dzrft4qb4nhkbdqa