[VEE] A Lightweight VMM on Many Core for High Performance Computing


덕쑤 2013. 3. 28. 15:23

I read this paper because virtualization was put to use on servers much earlier, so following how server virtualization has evolved might turn up ideas that apply equally to embedded systems.

VEE 2013

A Lightweight VMM on Many Core for High Performance Computing Xi’an Jiaotong University, China

Abstract

Traditional VMMs virtualize devices and instructions. This induces overhead in the guest operating system. On top of that, virtualization results in a large amount of code in the VMM, which makes the VMM prone to bugs and vulnerabilities. // The problems come from the VMM's large code base.

On the other hand, in cloud computing the cloud service provider configures virtual machines in advance according to the requirements specified by customers. Once the resources of multi-core servers become more than sufficient in the future, virtualization is no longer necessary, even though it is convenient for cloud computing.

Based on the above observations, this paper presents an alternative way of constructing a VMM: configuring a booting interface instead of virtualization technology. A lightweight virtual machine monitor, OSV, is proposed based on this idea. OSV can host multiple fully functional Linux kernels with little performance overhead. There are only 6 hyper-calls in OSV. The Linux running on top of OSV is intercepted only for the inter-processor interrupts. Resource isolation is implemented with hardware-assisted virtualization. Resource sharing is controlled by the distributed protocols embedded in current operating systems.

We implement a prototype of OSV on AMD Opteron processor based 32-core servers with SVM and cache-coherent NUMA architectures. OSV can host up to 8 Linux kernels on the server with less than 10 lines of code modifications to the Linux kernel. OSV has about 8000 lines of code which can be easily tuned and debugged. The experiment results show that OSV VMM has 23.7% performance improvement compared with Xen VMM. // Bottom line: it relies on hardware virtualization support to shrink the VMM and aims for a performance gain that way.

Damn it, my translation of sections 1 through 3.3 got wiped.

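3.4 Virtual Network Interface

To let virtual machines communicate with each other, the VMM has to provide virtual network interfaces. Over the virtual network interface, virtual machines can talk to each other using standard distributed protocols; for example, virtual machines can share files through NFS. Thanks to the virtual network interface, applications that use existing distributed protocols run on OSV without modification. This keeps OSV compatible with the current computing base.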

There is a virtual network interface card(VNIC) for each virtual domain. Each VNIC has a private memory queue for receiving data packets. When the network code of the operating system submits a packet to the VNIC, the packet will be forwarded to the corresponding memory queue of the VNIC whose hardware address matches the packet’s destination. When receiving a packet, the VNIC passes the packet to the network stack code. These memory operations for processing the packet are all done by the operating system without the interception of the VMM which can increase the performance of the network stack. VNIC is provided as a driver module which can be installed in Linux without modifying Linux’s source code.

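// The VNIC ships as a loadable module, so Linux itself needs no modification, and since the virtual machines talk to each other over the virtual network, the VMM does not need to step in.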
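The forwarding rule described in 3.4 (each VNIC owns a private receive queue, and a submitted packet lands in the queue of the VNIC whose hardware address matches the destination) can be sketched roughly as follows. This is only an illustration under my own assumptions: the `Vnic` class, the `submit` function and the dictionary packet layout are hypothetical names, not OSV's actual driver code, which runs inside the Linux kernel.

```python
from collections import deque


class Vnic:
    """Hypothetical VNIC: a hardware address plus a private receive queue."""

    def __init__(self, hw_addr):
        self.hw_addr = hw_addr
        self.rx_queue = deque()  # stands in for the per-VNIC memory queue

    def receive(self):
        # Hand the oldest packet to the network stack, or None if the queue is empty.
        return self.rx_queue.popleft() if self.rx_queue else None


def submit(vnics, packet):
    """Place the packet in the queue of the VNIC whose address matches its destination."""
    for vnic in vnics:
        if vnic.hw_addr == packet["dst"]:
            vnic.rx_queue.append(packet)
            return True
    return False  # no VNIC owns this address, so the packet is dropped
```

Nothing in this path involves a third party between sender and receiver, which mirrors the point that the guest operating systems do the memory operations themselves without the VMM intercepting them.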

3.5 Inter-OS Communication Socket

The performance of inter-OS communication over the virtual network interface is limited by the TCP/IP network stack. To improve performance, OSV provides a socket that resembles UNIX IPC. The API is the same as the UNIX IPC socket. When creating a socket, one just needs to set the socket type to the OSV type. Then traditional socket functions such as listen, accept, send and so on are used to send and receive data. The data transmission mechanism is similar to the virtual network interface: each virtual machine has a memory queue, and each data packet is placed in the queue that matches the destination virtual machine ID and port. As with the virtual network interface, the memory operations are all done by the operating systems and need not trap into the VMM. Like the VNIC mentioned in 3.4, the OSV socket is also a driver module installed in the guest OS. Extra header files for C programs are also provided. Applications with source code can be easily ported to this socket: by changing the socket type to OSV and using the OSV-defined address space, an application can communicate between guests through the OSV socket. This is helpful for applications which are sensitive to the communication bandwidth. // The OSV socket: to speed up communication between guest OSes, they build dedicated communication logic instead of going through TCP/IP. This is a module as well.

3.6 Hypervisor Call

The design principle of the OSV kernel is that the VMM restricts the range of resources an operating system may access and leaves the decisions about how to use those resources to the operating system. This principle reduces the interface complexity between operating system and VMM. OSV VMM has only two types of hyper-call: resource access restriction functions and communication channel construction functions. The resource restriction functions are used to allocate resources exclusively and to synchronize access to devices shared among virtual machines. The resources include I/O devices, Local APIC, I/O APIC and so on. The latter type of functions is used for communication between operating systems. When two operating systems need to communicate with each other, one operating system calls the function with the physical address of the allocated memory and the ID pair of the two virtual machines as arguments; then the other operating system traps into the VMM to get the memory address with the ID pair. The shared memory and ID pair compose the communication channel. The memory queues for the virtual network interface and the inter-OS communication socket are constructed using these functions. Trapping into the VMM only when necessary improves communication performance and also reduces the complexity of the OSV VMM. // There are two types: one for resource isolation and one for supporting the virtual network interface.
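The channel-construction step in 3.6 amounts to a small rendezvous table: one guest registers a shared-memory address under a VM ID pair, and the peer looks it up with the same pair. Here is a minimal sketch of that idea. The `ChannelTable` class and its method names are hypothetical, and the real hyper-call names and argument layout are not given in this summary.

```python
class ChannelTable:
    """Hypothetical VMM-side table mapping a VM ID pair to a shared-memory address."""

    def __init__(self):
        self._channels = {}

    def register(self, caller_id, peer_id, phys_addr):
        # The key is an unordered pair so that either side can look the channel up.
        self._channels[frozenset((caller_id, peer_id))] = phys_addr

    def lookup(self, caller_id, peer_id):
        # Returns the registered physical address, or None if no channel exists yet.
        return self._channels.get(frozenset((caller_id, peer_id)))
```

After the lookup both guests hold the same physical address, so later traffic through that memory needs no further trap into the VMM.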

3.7 Virtual Machine Construction

A virtual machine instance contains physical processor cores and main memory. Each virtual machine has statically allocated amounts of processor cores and main memory which cannot be changed once it is constructed. The resources are allocated based on their NUMA node affinity. Allocating processors and memory from the same NUMA domain shortens memory access time. The I/O devices are only allocated to the privileged domain. Other domains can access the services provided by these devices through standard distributed services exported by the privileged domain. The OS image and its drivers are loaded into main memory when the system is booting. The resources needed by a domain are preallocated before it is booted. A domain is created when the system load is high. The virtual machine instance is created by issuing a hypervisor call in the privileged domain. When the virtual machine is booted, tasks can be submitted to it using standard distributed protocols.
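To make the NUMA-affinity rule concrete, here is a rough sketch of a static allocator that only hands out cores and memory that sit on the same node. The data layout (a dictionary of nodes with `free_cores` and `free_mem_mb`) and the first-fit policy are assumptions made for illustration; the summary does not say how OSV actually picks a node.

```python
def allocate_domain(nodes, cores_needed, mem_needed_mb):
    """Pick the first NUMA node that can supply both the cores and the memory locally.

    nodes maps node_id -> {"free_cores": [core ids], "free_mem_mb": int}.
    Returns (node_id, allocated_cores), or None if no single node fits.
    """
    for node_id, node in nodes.items():
        has_cores = len(node["free_cores"]) >= cores_needed
        has_mem = node["free_mem_mb"] >= mem_needed_mb
        if has_cores and has_mem:
            # Take the cores and memory out of the free pool; the allocation is static.
            cores = [node["free_cores"].pop(0) for _ in range(cores_needed)]
            node["free_mem_mb"] -= mem_needed_mb
            return node_id, cores
    return None
```

Refusing to split a domain across nodes is the simplest way to guarantee that every memory access from the domain's cores stays local.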

3.8 Isolation Between Guests

CPU cores and memory are preallocated to a guest OS. The guest OS initialises its internal structures based on these resources, so resources outside this range will not be accessed by the guest OS. OSV has to control devices and interrupts in order to eliminate interference between guest OSes. For devices, as described in the previous section 3.3, all devices are allocated to a privileged operating system. So, interrupts from devices are not intercepted by OSV. The only thing OSV has to do is control port operations to prevent fault states such as reboot or reset operations. If a guest OS attempts to write a port, it will trap into OSV. OSV analyzes this operation to detect abnormal states, such as a reboot operation. If a reboot operation is found, OSV only reboots the corresponding guest OS, not the computer. Operations which would crash the server are denied by OSV. // It works out whether a port write is an ordinary operation or a reboot; if it is a reboot, only that guest is rebooted so the whole machine does not go down.

The IPI is a special interrupt used by the operating system. OSV intercepts IPIs to prevent a guest OS from sending an IPI to other guests. A mis-sent IPI would corrupt other guests. Based on these mechanisms, although there is no virtualization layer in OSV, guest OSes can be well isolated from each other.
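The port-write check could look something like the sketch below. The summary does not say which ports OSV inspects, so the two reset paths used here are my assumption: writing 0xFE to the keyboard controller port 0x64, and setting the reset bit in the reset control register at 0xCF9, both of which are common reset methods on PC platforms. The function name and return values are hypothetical, and the "deny" case for other crash-inducing operations is left out.

```python
# Assumed reset paths on a PC platform (not taken from the paper).
KBD_CTRL_PORT = 0x64       # keyboard controller command port
KBD_PULSE_RESET = 0xFE     # command that pulses the CPU reset line
RESET_CTRL_PORT = 0xCF9    # reset control register
RESET_CTRL_RST_BIT = 0x04  # bit that triggers the reset


def classify_port_write(port, value):
    """Decide what the VMM should do with a trapped guest port write."""
    if port == KBD_CTRL_PORT and value == KBD_PULSE_RESET:
        return "reboot_guest"
    if port == RESET_CTRL_PORT and value & RESET_CTRL_RST_BIT:
        return "reboot_guest"
    return "allow"
```

On "reboot_guest" the VMM would restart only the domain that issued the write instead of letting the write reach the hardware.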

3.9 Application Program Framework

A many core server running the OSV VMM is more like a distributed system. A map-reduce programming framework is provided in OSV. Applications written with this framework are distributed among the OSes. The applications are submitted in the manager OS, and the results are also assembled there. The framework is implemented on top of the OSV socket, which provides good performance.

4. Implementation

OSV VMM is implemented as a multithreaded program running on multi-core processor based servers. OSV differs from existing systems in that it pays more attention to isolating resources than to virtualizing them. For example, OSV does not contain any structure to virtualize the processor cores and main memory. The code size of OSV is about 8,000 lines, which makes it easier to tune. The current implementation is based on AMD Opteron processors with SVM technology support and can run multiple Linux operating systems concurrently. Table 1 lists the approaches used by OSV to host multiple operating systems.

4.1 Multi-Processor Support

In order to support commodity SMP operating systems, a traditional VMM needs a virtual CPU struct to provide IPI and schedules the mapping between virtual CPUs and physical cores. In OSV, operating systems use physical CPUs directly. There are two challenges: // The two problems to solve are multi-processor boot and inter-processor interrupts.

Multi-processor boot :

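x86 based multi-processor systems are booted using the universal SMP boot protocol. This protocol delivers two INIT IPIs to a processor core. An INIT IPI resets the processor and makes it jump to a specific location and start executing there. The reset clears all of the processor's state, including the registers initialized by OSV. This will make OSV lose control of the CPU cores. // The IPIs issued during boot have to be caught and handled by OSV so that the cores do not all get reset.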

Inter-Processor Interrupt :

Traditional operating systems need IPIs to synchronize system state and to assign work to CPUs. Some IPIs are sent in logical destination mode, which causes CPU cores with the same logical ID in other operating systems to respond to the IPI. This will lead the system to an unstable state. For the first one, we catch the INIT interrupt in the OSV VMM. This prevents the processor core from being reset. When OSV catches an INIT IPI, it initialises the guest CPU to 16-bit mode, obtains the address of the start code the CPU should execute after the INIT interrupt, and points the CPU at that instruction when returning from the VMM. The CPU will do as the traditional multiboot protocol does except for the reset action. The start code IP address is stored by the Linux kernel at 0x467. This address is specified by the multiboot specification. The current implementation of OSV is somewhat tricky and heavily dependent on AMD processors, because INIT IPI redirection is based on AMD’s SVM technology. Modifying the source code of the OS is another way to support multi-processor booting. The SMP-booting code is not very complex and is irrelevant to the performance of the OS. For example, the SMP-booting code for Linux is all located in one C file. The modifications to the source code are fewer than 10 lines of C code, including INIT IPI sending functions and some port operations. This can make the VMM more portable.

4.2 OSV Socket // about IPC

In cloud computing, operating systems running on a VMM communicate with each other frequently. They can communicate with each other through a traditional network. In OSV, a socket is provided to improve the communication performance between operating systems running on OSV. The socket is UNIX IPC like. The data transfer between operating systems can be done without the intervention of the OSV kernel. This section is about the implementation of the osv protocol. This protocol is based on shared memory and implements the Berkeley Sockets interface. Programmers can use it just like the normal inet protocol. Addressing the kernel of each domain is solved with an IP address, and each program in a domain is distinguished by its access point, that is, its port number. Figure 4 shows the overall structure of the OSV socket. Each domain has only one receive buffer and checks this queue periodically (has anything come in or not?). In order to make domains able to put messages into each other’s buffer, the kernel allocates the receive buffer at initialization and returns each buffer’s physical address. Each socket has a buffer list to store the messages received but not yet handled. One socket corresponds to one access point, distinguished by a unique integer.

There are two types of communication in osv protocol:

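Inter-process communication within a domain:

The send procedure packages the message into an osv message data structure, which contains the sender and receiver addresses, the sender and receiver port numbers, the message length, the message data and the message type. It then builds an osv skb and inserts it into the receiver's skb list. When the receiver invokes a receive procedure, it will get this message from its skb list.

Inter-process data transfer across domains:

During protocol initialization, each domain learns the physical addresses of the other domains' receive buffers, so domains can easily put data into each other's receive buffers. As with IPC inside a domain, the sender first packages the message. It then stores the data in the receive buffer of the target domain according to the receiver's IP address. Because each domain has a work queue that scans its own receive buffer, the message is discovered immediately after being put into the buffer. Then the message data is picked up from the osv message data structure, packaged into an osv skb, and inserted into the receiver’s skb list. The remaining steps are the same as for IPC within a domain.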
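The cross-domain path above can be modelled in a few lines. This is a user-space sketch under stated assumptions, not the kernel module: `OsvMessage`, `Domain`, `send`, `scan` and `recv` are hypothetical names, deques stand in for the shared-memory receive buffer and the skb lists, and the periodic work queue is replaced by an explicit `scan()` call.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class OsvMessage:
    """Fields listed in the text: addresses, ports, type and data (length is derived)."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    msg_type: int
    data: bytes

    @property
    def length(self):
        return len(self.data)


class Domain:
    def __init__(self, addr):
        self.addr = addr
        self.recv_buffer = deque()  # the single receive buffer of this domain
        self.skb_lists = {}         # port -> messages received but not yet handled

    def scan(self):
        # Stand-in for the work queue: move messages from the buffer to per-socket lists.
        while self.recv_buffer:
            msg = self.recv_buffer.popleft()
            self.skb_lists.setdefault(msg.dst_port, deque()).append(msg)

    def recv(self, port):
        queue = self.skb_lists.get(port)
        return queue.popleft().data if queue else None


def send(domains, msg):
    """Write the packaged message straight into the target domain's receive buffer."""
    domains[msg.dst_addr].recv_buffer.append(msg)
```

The sender writes directly into the receiver's buffer, and the message only becomes visible to a socket once the receiver's own scan has moved it to the list for that port.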

4.3 Linux Kernel

In order to host multiple Linux kernels concurrently, the Linux source code has to be modified. The modifications for domain 0 differ from those for domain X:

Domain 0 : Domain 0 needs to forward the external timer to the other domains. So in its timer interrupt function, domain 0 has to use IPIs to pass the timer on to the other domains. This is handled with the hypercall irq0_forward(). It takes about 5 added lines of code.

Domain X : The external timer interrupt for normal domains is received through an IPI, so the timer interrupt handler should issue an EOI to the APIC of the CPU. So, the function ack_APIC_irq() should be called in the timer interrupt function. The Linux kernel assigns each online CPU a logical id. When running multiple Linux kernels, the logical ids may be confused while sending IPIs. So the assignment of logical ids to the APIC should be intercepted by the OSV kernel. The action of sending an IPI also needs to be intercepted. This is done by intercepting the writes to APIC registers. When the Linux kernel writes an APIC register, it traps to the OSV kernel; OSV translates the logical id to a physical id and then sends the IPI using the physical id.

All the modifications to the Linux kernel are summarized in Table 2. The total is fewer than 10 lines of code, which makes it easy to port to Linux.
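The logical-to-physical translation can be pictured as a per-domain lookup table that the VMM fills when it intercepts a logical id assignment and consults when it intercepts an IPI send. The sketch below is illustrative only: `IpiTranslator` and its methods are hypothetical names, and the real mechanism works on trapped APIC register writes rather than Python calls.

```python
class IpiTranslator:
    """Hypothetical VMM-side map from (domain, logical id) to a physical APIC id."""

    def __init__(self):
        self._map = {}

    def assign(self, domain_id, logical_id, physical_id):
        # Recorded when the VMM intercepts a guest assigning a logical id to a CPU.
        self._map[(domain_id, logical_id)] = physical_id

    def translate(self, domain_id, logical_id):
        # Returns the physical id to target, or None if this domain owns no such CPU.
        return self._map.get((domain_id, logical_id))
```

Because the key includes the domain, two guests that both use logical id 1 resolve to different physical cores, which is exactly the confusion the interception is meant to avoid.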

5. Evaluation

This section demonstrates that OSV’s approach is beneficial to operating systems’ performance. We first measure the overhead of virtualization using a set of operating system benchmarks. The performance is compared with Xen and the native Linux kernel. Then the performance of OSV’s network system is measured. Finally, memcached is used to show the overall performance of OSV. The experiments were performed on two servers: // memcached, network system

5.1 Operating System Benchmarks

To measure the performance overhead of the OSV VMM, we ran several experiments targeting specific subsystems. The lmbench benchmark (a performance measurement tool from 1996) is used to measure the overhead. We compared the performance of Linux on bare metal, XEN HVM and PVM. The XEN guest OS is configured to be bound to a NUMA node: the guest OS can only access the DRAM and CPU cores on the specified node. This reduces the scheduling overhead of XEN and avoids cross-NUMA-node memory accesses. This experiment is carried out on the 32-core server. For native Linux, lmbench is not configured to bind to a NUMA node.

Figure 5 shows the local communication latency. XEN HVM performs about the same as OSV, and both do better than the others. This is mainly caused by the NUMA architecture of the multi-core server, where local access time is smaller than remote access time. The resources used by the OSV kernel and XEN HVM are bound to a NUMA node, so the CPU cores only make local DRAM accesses, which makes the performance better than Linux.

Figure 6 shows that the OSV kernel gets a high bandwidth compared to Linux and the XEN HVM guest, especially in the mem read test. The guest OS in OSV VMM accesses the resources allocated to it without the intervention of OSV VMM, which gives it lower performance overhead. Because of the remote DRAM and cache coherence latencies, Linux’s bandwidth is the lowest with a large number of CPU cores, 32 cores in this experiment. In the File reread test, the OSV domain1 has the highest bandwidth. This is caused by the NFS based file system. When a file has already been accessed, it will remain in the NFS client cache, which

