0%

MIT6.828 Lab3

Exercise 1

如同Exercise描述中所说的,类似pages分配的方法一样利用boot_alloc()就好,由于后面有env_init()函数负责结构体的初始化,这个地方不需要进行初始化的操作。

1
2
3
// Make 'envs' point to an array of size 'NENV' of 'struct Env'.
// LAB 3: Your code here.
envs = (struct Env *) boot_alloc(sizeof(struct Env) * NENV);

同时采用相似的方法将其map到UENVS处:

1
2
3
4
5
6
7
8
9
// Map the 'envs' array read-only by the user at linear address UENVS
// (ie. perm = PTE_U | PTE_P).
// Permissions:
// - the new image at UENVS -- kernel R, user R
// - envs itself -- kernel RW, user NONE
// LAB 3: Your code here.
n = ROUNDUP(NENV*sizeof(struct Env), PGSIZE);
for (i = 0; i < n; i+=PGSIZE)
page_insert(kern_pgdir, pa2page(PADDR(envs) + i), (void *)(UENVS + i), PTE_U | PTE_P);

Exericse 2

env_init()

这里的操作实际上就是进行一个envs的初始化,最开始所有的都是空闲状态,将其插入env_free_list就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
// Mark all environments in 'envs' as free, set their env_ids to 0,
// and insert them into the env_free_list.
// Make sure the environments are in the free list in the same order
// they are in the envs array (i.e., so that the first call to
// env_alloc() returns envs[0]).
//
void
env_init(void)
{
// Set up envs array
// LAB 3: Your code here.
memset(envs, 0, sizeof(struct Env) * NENV);

int i;
env_free_list = envs;
for(i = 1; i < NENV; ++i)
envs[i-1].env_link = envs + i;

// Per-CPU part of the initialization
env_init_percpu();
}

env_setup_vm()

这里可以知道,在UTOP下面应当是空白的,在UTOP上面都是相同的,所以首先对整个page进行清空,之后利用memcpy以kern_pgdir为模板,只需要进行page table的修改就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
//
// Initialize the kernel virtual memory layout for environment e.
// Allocate a page directory, set e->env_pgdir accordingly,
// and initialize the kernel portion of the new environment's address space.
// Do NOT (yet) map anything into the user portion
// of the environment's virtual address space.
//
// Returns 0 on success, < 0 on error. Errors include:
// -E_NO_MEM if page directory or table could not be allocated.
//
static int
env_setup_vm(struct Env *e)
{
int i;
struct PageInfo *p = NULL;

// Allocate a page for the page directory
if (!(p = page_alloc(ALLOC_ZERO)))
return -E_NO_MEM;

// Now, set e->env_pgdir and initialize the page directory.
//
// Hint:
// - The VA space of all envs is identical above UTOP
// (except at UVPT, which we've set below).
// See inc/memlayout.h for permissions and layout.
// Can you use kern_pgdir as a template? Hint: Yes.
// (Make sure you got the permissions right in Lab 2.)
// - The initial VA below UTOP is empty.
// - You do not need to make any more calls to page_alloc.
// - Note: In general, pp_ref is not maintained for
// physical pages mapped only above UTOP, but env_pgdir
// is an exception -- you need to increment env_pgdir's
// pp_ref for env_free to work correctly.
// - The functions in kern/pmap.h are handy.

// LAB 3: Your code here.
e->env_pgdir = page2kva(p);
++(p->pp_ref);
memset(e->env_pgdir, 0, PGSIZE);
memcpy(e->env_pgdir + PDX(UTOP), kern_pgdir + PDX(UTOP), PGSIZE - (PDX(UTOP)<<2));

// UVPT maps the env's own page table read-only.
// Permissions: kernel R, user R
e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;

return 0;
}

region_alloc()

这里可以仿照在pmap.c中多次实现的alloc操作,只不过这里的page是利用page_alloc()得到的:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
//
// Allocate len bytes of physical memory for environment env,
// and map it at virtual address va in the environment's address space.
// Does not zero or otherwise initialize the mapped pages in any way.
// Pages should be writable by user and kernel.
// Panic if any allocation attempt fails.
//
static void
region_alloc(struct Env *e, void *va, size_t len)
{
// LAB 3: Your code here.
// (But only if you need it for load_icode.)
//
// Hint: It is easier to use region_alloc if the caller can pass
// 'va' and 'len' values that are not page-aligned.
// You should round va down, and round (va + len) up.
// (Watch out for corner-cases!)
void *start_va = ROUNDDOWN(va, PGSIZE);
void *end_va = ROUNDUP(va + len, PGSIZE);
void *cur_va;
for(cur_va = start_va; cur_va < end_va; cur_va += PGSIZE)
{
struct PageInfo * pp = page_alloc(0);
if(!pp)
panic("region_alloc: Out of memory!\n");
page_insert(e->env_pgdir, pp, (void *)cur_va, PTE_U | PTE_W);
}
}

load_icode()

先看看bootmain()当中是怎么操作的:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
void
bootmain(void)
{
struct Proghdr *ph, *eph;

// read 1st page off disk
readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);

// is this a valid ELF?
if (ELFHDR->e_magic != ELF_MAGIC)
goto bad;

// load each program segment (ignores ph flags)
ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
eph = ph + ELFHDR->e_phnum;
for (; ph < eph; ph++)
// p_pa is the load address of this segment (as well
// as the physical address)
readseg(ph->p_pa, ph->p_memsz, ph->p_offset);

// call the entry point from the ELF header
// note: does not return!
((void (*)(void)) (ELFHDR->e_entry))();

bad:
outw(0x8A00, 0x8A00);
outw(0x8A00, 0x8E00);
while (1)
/* do nothing */;
}

仿照第13-19行进行下面的code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
//
// Set up the initial program binary, stack, and processor flags
// for a user process.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
//
// This function loads all loadable segments from the ELF binary image
// into the environment's user memory, starting at the appropriate
// virtual addresses indicated in the ELF program header.
// At the same time it clears to zero any portions of these segments
// that are marked in the program header as being mapped
// but not actually present in the ELF file - i.e., the program's bss section.
//
// All this is very similar to what our boot loader does, except the boot
// loader also needs to read the code from disk. Take a look at
// boot/main.c to get ideas.
//
// Finally, this function maps one page for the program's initial stack.
//
// load_icode panics if it encounters problems.
// - How might load_icode fail? What might be wrong with the given input?
//
static void
load_icode(struct Env *e, uint8_t *binary)
{
// Hints:
// Load each program segment into virtual memory
// at the address specified in the ELF segment header.
// You should only load segments with ph->p_type == ELF_PROG_LOAD.
// Each segment's virtual address can be found in ph->p_va
// and its size in memory can be found in ph->p_memsz.
// The ph->p_filesz bytes from the ELF binary, starting at
// 'binary + ph->p_offset', should be copied to virtual address
// ph->p_va. Any remaining memory bytes should be cleared to zero.
// (The ELF header should have ph->p_filesz <= ph->p_memsz.)
// Use functions from the previous lab to allocate and map pages.
//
// All page protection bits should be user read/write for now.
// ELF segments are not necessarily page-aligned, but you can
// assume for this function that no two segments will touch
// the same virtual page.
//
// You may find a function like region_alloc useful.
//
// Loading the segments is much simpler if you can move data
// directly into the virtual addresses stored in the ELF binary.
// So which page directory should be in force during
// this function?
//
// You must also do something with the program's entry point,
// to make sure that the environment starts executing there.
// What? (See env_run() and env_pop_tf() below.)

// LAB 3: Your code here.
struct Elf * ELFHDR = (struct Elf *)binary;
struct Proghdr * ph, * eph;
lcr3(PADDR(e->env_pgdir));
ph = (struct Proghdr *)(binary + ELFHDR->e_phoff);
eph = ph + ELFHDR->e_phnum;
for(; ph < eph; ++ph)
{
if(ph->p_type == ELF_PROG_LOAD)
{
region_alloc(e, (void *)ph->p_va, ph->p_memsz);
memset((void *)ph->p_va, 0, ph->p_memsz);
memcpy((void *)ph->p_va, binary + ph->p_offset, ph->p_filesz);
}
}
e->env_tf.tf_eip = ELFHDR->e_entry;

lcr3(PADDR(kern_pgdir));
// Now map one page for the program's initial stack
// at virtual address USTACKTOP - PGSIZE.

// LAB 3: Your code here.
region_alloc(e, (void *)(USTACKTOP - PGSIZE), PGSIZE);
}

env_create()

直接调用env_alloc()load_icode()就可以了,可以看做是在上面的一层封装:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
//
// Allocates a new env with env_alloc, loads the named elf
// binary into it with load_icode, and sets its env_type.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
// The new env's parent ID is set to 0.
//
void
env_create(uint8_t *binary, enum EnvType type)
{
// LAB 3: Your code here.
struct Env * e;
if(env_alloc(&e, 0))
panic("env_create: env alloc failed!\n");
load_icode(e, binary);
e->env_type = type;
}

env_run()

要完成的是进行一个上下文的切换,这里主要做的就是首先对于env需要进行状态的改变,之后需要进行地址空间的切换。同时利用已经存在的env_pop_tf()函数来进行寄存器的恢复,具体的代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
//
// Context switch from curenv to env e.
// Note: if this is the first call to env_run, curenv is NULL.
//
// This function does not return.
//
void
env_run(struct Env *e)
{
// Step 1: If this is a context switch (a new environment is running):
// 1. Set the current environment (if any) back to
// ENV_RUNNABLE if it is ENV_RUNNING (think about
// what other states it can be in),
// 2. Set 'curenv' to the new environment,
// 3. Set its status to ENV_RUNNING,
// 4. Update its 'env_runs' counter,
// 5. Use lcr3() to switch to its address space.
// Step 2: Use env_pop_tf() to restore the environment's
// registers and drop into user mode in the
// environment.

// Hint: This function loads the new environment's state from
// e->env_tf. Go back through the code you wrote above
// and make sure you have set the relevant parts of
// e->env_tf to sensible values.

// LAB 3: Your code here.

if(curenv != NULL && curenv->env_status == ENV_RUNNING)
curenv->env_status = ENV_RUNNABLE;
curenv = e;
e->env_status = ENV_RUNNING;
++(e->env_runs);
lcr3(PADDR(e->env_pgdir));

env_pop_tf(&(e->env_tf));

//panic("env_run not yet implemented");
}

gdb对hello进行断点调试

在obj/user/hello.asm里面可以看到

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
void
sys_cputs(const char *s, size_t len)
{
800a1c: 55 push %ebp
800a1d: 89 e5 mov %esp,%ebp
800a1f: 57 push %edi
800a20: 56 push %esi
800a21: 53 push %ebx
//
// The last clause tells the assembler that this can
// potentially change the condition codes and arbitrary
// memory locations.

asm volatile("int %1\n"
800a22: b8 00 00 00 00 mov $0x0,%eax
800a27: 8b 4d 0c mov 0xc(%ebp),%ecx
800a2a: 8b 55 08 mov 0x8(%ebp),%edx
800a2d: 89 c3 mov %eax,%ebx
800a2f: 89 c7 mov %eax,%edi
800a31: 89 c6 mov %eax,%esi
800a33: cd 30 int $0x30

对应的地址为0x800a33,利用gdb进行断点设置:

1
2
3
4
5
6
7
(gdb) b *0x800a33
Breakpoint 2 at 0x800a33
(gdb) c
Continuing.
=> 0x800a33: int $0x30

Breakpoint 2, 0x00800a33 in ?? ()

发现确实执行到了这一条指令,以上的实现应该是没有问题。

Exercise 3

内容为阅读Chapter 9,是关于Exceptions和Interrupts的内容。

Exercise 4

从inc/trap.h当中可以发现,TrapFrame有着如下的结构:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
struct Trapframe {
struct PushRegs tf_regs;
uint16_t tf_es;
uint16_t tf_padding1;
uint16_t tf_ds;
uint16_t tf_padding2;
uint32_t tf_trapno;
/* below here defined by x86 hardware */
uint32_t tf_err;
uintptr_t tf_eip;
uint16_t tf_cs;
uint16_t tf_padding3;
uint32_t tf_eflags;
/* below here only when crossing rings, such as from user to kernel */
uintptr_t tf_esp;
uint16_t tf_ss;
uint16_t tf_padding4;
} __attribute__((packed));

可以知道对于剩下的就是要保存%es和%ds,来使得最终结构为一个Trapframe,剩下的按照Exercise的描述操作就可以了,得到_alltraps的结构如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
/*
* Lab 3: Your code here for _alltraps
*/
.global _alltraps
_alltraps:
pushl %ds
pushl %es
pushal
pushl $GD_KD
popl %ds
pushl $GD_KD
popl %es
pushl %esp
call trap

利用已经存在的TRAPHANDLERTRAPHANDLER_NOEC宏可以来生成handler的入口,只需要区分有没有错误码就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
/*
* Lab 3: Your code here for generating entry points for the different traps.
*/

TRAPHANDLER_NOEC(handler_divide, T_DIVIDE)
TRAPHANDLER_NOEC(handler_debug, T_DEBUG)
TRAPHANDLER_NOEC(handler_nmi, T_NMI)
TRAPHANDLER_NOEC(handler_brkpt, T_BRKPT)
TRAPHANDLER_NOEC(handler_oflow, T_OFLOW)
TRAPHANDLER_NOEC(handler_bound, T_BOUND)
TRAPHANDLER_NOEC(handler_illop, T_ILLOP)
TRAPHANDLER_NOEC(handler_device, T_DEVICE)
TRAPHANDLER(handler_dblflt, T_DBLFLT)
TRAPHANDLER(handler_tss, T_TSS)
TRAPHANDLER(handler_segnp, T_SEGNP)
TRAPHANDLER(handler_stack, T_STACK)
TRAPHANDLER(handler_gpflt, T_GPFLT)
TRAPHANDLER(handler_pgflt, T_PGFLT)
TRAPHANDLER_NOEC(handler_fperr, T_FPERR)
TRAPHANDLER(handler_align, T_ALIGN)
TRAPHANDLER_NOEC(handler_mchk, T_MCHK)
TRAPHANDLER_NOEC(handler_simderr, T_SIMDERR)
TRAPHANDLER_NOEC(handler_syscall, T_SYSCALL)
TRAPHANDLER_NOEC(handler_default, T_DEFAULT)

通过查询80386手册的9.10可以看到如下关于error code的总结:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Description                       Interrupt     Error Code
Number

Divide error 0 No
Debug exceptions 1 No
Breakpoint 3 No
Overflow 4 No
Bounds check 5 No
Invalid opcode 6 No
Coprocessor not available 7 No
System error 8 Yes (always 0)
Coprocessor Segment Overrun 9 No
Invalid TSS 10 Yes
Segment not present 11 Yes
Stack exception 12 Yes
General protection fault 13 Yes
Page fault 14 Yes
Coprocessor error 16 No
Two-byte SW interrupt 0-255 No

在inc/mmu.h当中可以看到有关SETGATE的描述:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// Set up a normal interrupt/trap gate descriptor.
// - istrap: 1 for a trap (= exception) gate, 0 for an interrupt gate.
// see section 9.6.1.3 of the i386 reference: "The difference between
// an interrupt gate and a trap gate is in the effect on IF (the
// interrupt-enable flag). An interrupt that vectors through an
// interrupt gate resets IF, thereby preventing other interrupts from
// interfering with the current interrupt handler. A subsequent IRET
// instruction restores IF to the value in the EFLAGS image on the
// stack. An interrupt through a trap gate does not change IF."
// - sel: Code segment selector for interrupt/trap handler
// - off: Offset in code segment for interrupt/trap handler
// - dpl: Descriptor Privilege Level -
// the privilege level required for software to invoke
// this interrupt/trap gate explicitly using an int instruction.
#define SETGATE(gate, istrap, sel, off, dpl) \
{ \
(gate).gd_off_15_0 = (uint32_t) (off) & 0xffff; \
(gate).gd_sel = (sel); \
(gate).gd_args = 0; \
(gate).gd_rsv1 = 0; \
(gate).gd_type = (istrap) ? STS_TG32 : STS_IG32; \
(gate).gd_s = 0; \
(gate).gd_dpl = (dpl); \
(gate).gd_p = 1; \
(gate).gd_off_31_16 = (uint32_t) (off) >> 16; \
}

之后再trap_init()当中进行这样的填充,要注意到断点和系统调用的dpl需要设置为3(用户):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// LAB 3: Your code here.
void handler_divide();
SETGATE(idt[T_DIVIDE], 0, GD_KT, handler_divide, 0);
void handler_debug();
SETGATE(idt[T_DEBUG], 0, GD_KT, handler_debug, 0);
void handler_nmi();
SETGATE(idt[T_NMI], 0, GD_KT, handler_nmi, 0);
void handler_brkpt();
SETGATE(idt[T_BRKPT], 0, GD_KT, handler_brkpt, 3);
void handler_oflow();
SETGATE(idt[T_OFLOW], 0, GD_KT, handler_oflow, 0);
void handler_bound();
SETGATE(idt[T_BOUND], 0, GD_KT, handler_bound, 0);
void handler_illop();
SETGATE(idt[T_ILLOP], 0, GD_KT, handler_illop, 0);
void handler_device();
SETGATE(idt[T_DEVICE], 0, GD_KT, handler_device, 0);
void handler_dblflt();
SETGATE(idt[T_DBLFLT], 0, GD_KT, handler_dblflt, 0);
void handler_tss();
SETGATE(idt[T_TSS], 0, GD_KT, handler_tss, 0);
void handler_segnp();
SETGATE(idt[T_SEGNP], 0, GD_KT, handler_segnp, 0);
void handler_stack();
SETGATE(idt[T_STACK], 0, GD_KT, handler_stack, 0);
void handler_gpflt();
SETGATE(idt[T_GPFLT], 0, GD_KT, handler_gpflt, 0);
void handler_pgflt();
SETGATE(idt[T_PGFLT], 0, GD_KT, handler_pgflt, 0);
void handler_fperr();
SETGATE(idt[T_FPERR], 0, GD_KT, handler_fperr, 0);
void handler_align();
SETGATE(idt[T_ALIGN], 0, GD_KT, handler_align, 0);
void handler_mchk();
SETGATE(idt[T_MCHK], 0, GD_KT, handler_mchk, 0);
void handler_simderr();
SETGATE(idt[T_SIMDERR], 0, GD_KT, handler_simderr, 0);
void handler_syscall();
SETGATE(idt[T_SYSCALL], 1, GD_KT, handler_syscall, 3);
void handler_default();
SETGATE(idt[T_DEFAULT], 0, GD_KT, handler_default, 0);

利用make grade可以得到下面的输出:

1
2
3
4
5
6
7
divzero: OK (1.8s)
(Old jos.out.divzero failure log removed)
softint: OK (1.7s)
(Old jos.out.softint failure log removed)
badsegment: OK (2.1s)
(Old jos.out.badsegment failure log removed)
Part A score: 30/30

说明这里Part A的实现没有问题。

Question

  1. 如果所有的exception/interrupt都通过同样一个handler,那么就没有办法知道是通过哪一个中断进来的,不能设置对应的中断号,后面不能进行分发。

  2. 除了系统调用门,其他的特权级都设置成0,这里int $14本来应当触发page fault,但是这个时候权限不对,所以会触发general protection fault。如果允许他能够触发page fault的话,那么者会造成安全隐患。

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    [00000000] new env 00001000
    Incoming TRAP frame at 0xefffffbc
    TRAP frame at 0xf0226000
    edi 0x00000000
    esi 0x00000000
    ebp 0xeebfdfd0
    oesp 0xefffffdc
    ebx 0x00000000
    edx 0x00000000
    ecx 0x00000000
    eax 0x00000000
    es 0x----0023
    ds 0x----0023
    trap 0x0000000d General Protection
    err 0x00000072
    eip 0x00800037
    cs 0x----001b
    flag 0x00000046
    esp 0xeebfdfd0
    ss 0x----0023
    [00001000] free env 00001000

    当允许触发page fault的时候,可以看到保存的内容如下

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    [00000000] new env 00001000
    Incoming TRAP frame at 0xefffffc0
    TRAP frame at 0xefffffc0
    edi 0x00000000
    esi 0x00000000
    ebp 0xeebfdfd0
    oesp 0xefffffe0
    ebx 0x00000000
    edx 0x00000000
    ecx 0x00000000
    eax 0x00000000
    es 0x----0023
    ds 0x----0023
    trap 0x0000000e Page Fault
    cr2 0x00000000
    err 0x00800039 [kernel, read, protection]
    eip 0x0000001b
    cs 0x----0046
    flag 0xeebfdfd0
    esp 0x00000023
    ss 0x----ff53
    [00001000] free env 00001000

    在这里我的想法是第15行往后全部进行了一位的上移,可以看到之后的err当中的内容实际上应该是eip,eip的内容实际上是cs,以此类推。应该是cr2或者err code没有进行压入导致的。

Exercise 5

只需要在trap_dispatch()当中添加一个分支即可:

1
2
3
4
5
if(tf->tf_trapno == T_PGFLT)
{
page_fault_handler(tf);
return;
}

Exercise 6

和Exercise 5相同,只需要在trap_dispatch()里面添加一个分支:

1
2
3
4
5
if(tf->tf_trapno == T_BRKPT)
{
monitor(tf);
return;
}

Question

  1. 在于前面Exercise 4中的设置:

    1
    SETGATE(idt[T_BRKPT], 0, GD_KT, handler_brkpt, 3);

    这里当最后的dpl设置为3的时候,会正确的触发为break point exception,当设置为0的时候,会触发为general protection fault。其原因在于,如果设置为0,会导致断点触发需要内核级的权限,因为权限不够从而触发GPF。

  2. 这个测试的目的主要是检查权限是否设置正确,需要正确的区分用户和内核,防止用户对于内核代码进行操作产生安全隐患。

Exercise 7

在kern/trap.c里面,同之前两个Exercise一样进行分发的设置:

1
2
3
4
5
6
7
if(tf->tf_trapno == T_SYSCALL)
{
tf->tf_regs.reg_eax = syscall(tf->tf_regs.reg_eax,
tf->tf_regs.reg_edx, tf->tf_regs.reg_ecx, tf->tf_regs.reg_ebx,
tf->tf_regs.reg_edi, tf->tf_regs.reg_esi);
return;
}

在kern/syscall.c当中,利用switch进行分发即可,注意不同系统调用的参数就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// Dispatches to the correct kernel function, passing the arguments.
int32_t
syscall(uint32_t syscallno, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4, uint32_t a5)
{
// Call the function corresponding to the 'syscallno' parameter.
// Return any appropriate return value.
// LAB 3: Your code here.

//panic("syscall not implemented");

switch (syscallno) {
case SYS_cputs:
sys_cputs((const char *)a1, (size_t)a2);
return 0;

case SYS_cgetc:
return sys_cgetc();

case SYS_getenvid:
return sys_getenvid();

case SYS_env_destroy:
return sys_env_destroy((envid_t)a1);

case NSYSCALLS:
return 0;

default:
return -E_INVAL;
}
}

Exercise 8

在lib/libmain.c当中,进行env_id的指定:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
void
libmain(int argc, char **argv)
{
// set thisenv to point at our Env structure in envs[].
// LAB 3: Your code here.
envid_t envid = sys_getenvid();
thisenv = &envs[ENVX(envid)];

// save the name of the program so that panic() can use it
if (argc > 0)
binaryname = argv[0];

// call user main routine
umain(argc, argv);

// exit gracefully
exit();
}

Exercise 9

在kern/trap.c当中的page_fault_handler()函数当中,利用tf_cs来判断是不是在kernel-mode,如果是直接触发一个panic:

1
2
3
4
// Handle kernel-mode page faults.
// LAB 3: Your code here.
if(((tf->tf_cs)&3) == 0)
panic("page fault: happen in kernel mode! %08x\n", tf->tf_cs);

在kern/pmap.c当中,采用一个for循环对虚拟地址区间进行权限的检查,具体内容遵循注释就可以。当检查没有问题的时候返回值为0,否则返回值为-E_FAULT。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
//
// Check that an environment is allowed to access the range of memory
// [va, va+len) with permissions 'perm | PTE_P'.
// Normally 'perm' will contain PTE_U at least, but this is not required.
// 'va' and 'len' need not be page-aligned; you must test every page that
// contains any of that range. You will test either 'len/PGSIZE',
// 'len/PGSIZE + 1', or 'len/PGSIZE + 2' pages.
//
// A user program can access a virtual address if (1) the address is below
// ULIM, and (2) the page table gives it permission. These are exactly
// the tests you should implement here.
//
// If there is an error, set the 'user_mem_check_addr' variable to the first
// erroneous virtual address.
//
// Returns 0 if the user program can access this range of addresses,
// and -E_FAULT otherwise.
//
int
user_mem_check(struct Env *env, const void *va, size_t len, int perm)
{
// LAB 3: Your code here.
int newperm = perm | PTE_P;
uint32_t cur_addr;
pte_t * pte;
for(cur_addr = (uint32_t)va; cur_addr < (uint32_t)(va + len); cur_addr = ROUNDDOWN((cur_addr+PGSIZE),PGSIZE))
{
if(cur_addr >= ULIM)
{
user_mem_check_addr = cur_addr;
return -E_FAULT;
}
pte = pgdir_walk(env->env_pgdir, (void *)cur_addr, 0);
if((!pte) || ((*pte) & newperm) != newperm){
user_mem_check_addr = cur_addr;
return -E_FAULT;
}
}
return 0;
}

需要注意的是要在kern/syscall.c当中需要填充上有关检查的部分!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// Print a string to the system console.
// The string is exactly 'len' characters long.
// Destroys the environment on memory errors.
static void
sys_cputs(const char *s, size_t len)
{
// Check that the user has permission to read memory [s, s+len).
// Destroy the environment if not.

// LAB 3: Your code here.
user_mem_assert(curenv, s, len, PTE_W);

// Print the string supplied by the user.
cprintf("%.*s", len, s);
}

如果这里没有进行user_mem_assert()的话,执行buggyhello会进入系统调用然后在内核态触发page fault。

之后为backtrace相关的内容,在kern/kdebug.c当中添加有关usd,stabs,stabstr的检查如下,这里注意user_mem_check()当正常的时候返回值为0:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
else {
// The user-application linker script, user/user.ld,
// puts information about the application's stabs (equivalent
// to __STAB_BEGIN__, __STAB_END__, __STABSTR_BEGIN__, and
// __STABSTR_END__) in a structure located at virtual address
// USTABDATA.
const struct UserStabData *usd = (const struct UserStabData *) USTABDATA;

// Make sure this memory is valid.
// Return -1 if it is not. Hint: Call user_mem_check.
// LAB 3: Your code here.
if(user_mem_check(curenv, (void *)usd, sizeof(struct UserStabData), PTE_U))
return -1;

stabs = usd->stabs;
stab_end = usd->stab_end;
stabstr = usd->stabstr;
stabstr_end = usd->stabstr_end;

// Make sure the STABS and string table memory is valid.
// LAB 3: Your code here.
if(user_mem_check(curenv, (void *)stabs, (uint32_t)stabs-(uint32_t)stab_end, PTE_U))
return -1;
if(user_mem_check(curenv, (void *)stabstr, (uint32_t)stabstr_end-(uint32_t)stabstr, PTE_U))
return -1;
}

在执行breakpoint之后,利用backtrace得到的结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
K> backtrace
Stack backtrace:
ebp efffff10 eip f01010d6 args 00000001 efffff28 f0228000 00000000 f01e6a40
kern/monitor.c:448: monitor+260
ebp efffff80 eip f01048ca args f0228000 efffffbc 00000000 00000000 00000000
kern/trap.c:195: trap+180
ebp efffffb0 eip f0104a43 args efffffbc 00000000 00000000 eebfdfd0 efffffdc
kern/trapentry.S:85: <unknown>+0
ebp eebfdfd0 eip 0080007b args 00000000 00000000 00000000 00000000 00000000
lib/libmain.c:27: libmain+63
ebp eebfdff0 eip 00800031 args 00000000 00000000Incoming TRAP frame at 0xeffffec
kernel panic at kern/trap.c:270: page fault: happen in kernel mode! 00000008

可以看到最后为lib/libmain.c,并且最终在内核态发生了page fault。可以发现,这个地方args在输出到第三个参数的时候突然触发,那应该是从ebp向上读取args触发的page fault。

结合mom_backtrace()的实现如下,应该是在13行的语句处出现的错误:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
cprintf("Stack backtrace:\n");
uint32_t* ebp = (uint32_t*)read_ebp();
struct Eipdebuginfo info;
while(ebp){
cprintf("ebp %08x ",ebp);
cprintf("eip %08x ",ebp[1]);
cprintf("args");
int i;
for(i=2;i<=6;++i)
cprintf(" %08x",ebp[i]);
cprintf("\n");

Exercise 10

运行evilhello可以看到如下的输出:

1
2
3
4
5
6
7
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
Incoming TRAP frame at 0xefffffbc
[00001000] user_mem_check assertion failure for va f010000c
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!
Welcome to the JOS kernel monitor!

用户环境被销毁了,并且kernel没有panic,说明行为符合预期。

使用make grade命令可以得到如下结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
divzero: OK (1.4s)
softint: OK (1.4s)
badsegment: OK (2.0s)
Part A score: 30/30

faultread: OK (1.9s)
faultreadkernel: OK (1.6s)
faultwrite: OK (0.9s)
faultwritekernel: OK (1.6s)
breakpoint: OK (2.0s)
testbss: OK (2.1s)
hello: OK (1.8s)
buggyhello: OK (1.7s)
buggyhello2: OK (0.8s)
evilhello: OK (1.6s)
Part B score: 50/50

Score: 80/80

说明lab3的内容已经被完成了。

Challenge 1

参考了github上https://github.com/SimpCosm/6.828/tree/master/lab3的实现。

其中TRAPHANDLER和TRAPHANDLER_NOEC的主要区别就在于有没有压入error code,这里采用一个if语句来进行判断:

1
2
3
4
5
6
7
8
9
10
11
12
13
#define GENERALHANDLER(name, num)	\
.data; \
.long name; \
.text; \
.globl name; \
.type name, @function; \
.align 2; \
name: \
.if !(num == 8 || (num >= 10 && num <= 14) || num == 17 ); \
pushl $0; \
.endif; \
pushl $(num); \
jmp _alltraps

之后构建一个数组vectors用来保存相关函数,就可以采用脚本语言批量生成重复代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
.data
.globl vectors
vectors:
GENERALHANDLER(handler0, 0)
GENERALHANDLER(handler1, 1)
GENERALHANDLER(handler2, 2)
GENERALHANDLER(handler3, 3)
GENERALHANDLER(handler4, 4)
GENERALHANDLER(handler5, 5)
GENERALHANDLER(handler6, 6)
GENERALHANDLER(handler7, 7)
GENERALHANDLER(handler8, 8)
GENERALHANDLER(handler9, 9)
GENERALHANDLER(handler10, 10)
GENERALHANDLER(handler11, 11)
GENERALHANDLER(handler12, 12)
GENERALHANDLER(handler13, 13)
GENERALHANDLER(handler14, 14)
GENERALHANDLER(handler15, 15)
GENERALHANDLER(handler16, 16)
GENERALHANDLER(handler17, 17)
GENERALHANDLER(handler18, 18)
GENERALHANDLER(handler19, 19)
GENERALHANDLER(handler20, 20)
GENERALHANDLER(handler21, 21)
GENERALHANDLER(handler22, 22)
GENERALHANDLER(handler23, 23)
GENERALHANDLER(handler24, 24)
GENERALHANDLER(handler25, 25)
GENERALHANDLER(handler26, 26)
GENERALHANDLER(handler27, 27)
GENERALHANDLER(handler28, 28)
GENERALHANDLER(handler29, 29)
GENERALHANDLER(handler30, 30)
GENERALHANDLER(handler31, 31)
GENERALHANDLER(handler32, 32)
GENERALHANDLER(handler33, 33)
GENERALHANDLER(handler34, 34)
GENERALHANDLER(handler35, 35)
GENERALHANDLER(handler36, 36)
GENERALHANDLER(handler37, 37)
GENERALHANDLER(handler38, 38)
GENERALHANDLER(handler39, 39)
GENERALHANDLER(handler40, 40)
GENERALHANDLER(handler41, 41)
GENERALHANDLER(handler42, 42)
GENERALHANDLER(handler43, 43)
GENERALHANDLER(handler44, 44)
GENERALHANDLER(handler45, 45)
GENERALHANDLER(handler46, 46)
GENERALHANDLER(handler47, 47)
GENERALHANDLER(handler48, 48)
GENERALHANDLER(handler49, 49)
GENERALHANDLER(handler50, 50)
GENERALHANDLER(handler51, 51)
GENERALHANDLER(handler52, 52)
GENERALHANDLER(handler53, 53)

之后就可以在kern/trap.c中对trap_init()采用循环构建入口,节省大量代码,对于特殊的可以单独提出来进行构造:

1
2
3
4
5
int i;
for(i = 0; i < 54; ++i)
SETGATE(idt[i], 0, GD_KT, vectors[i], 0);
SETGATE(idt[T_BRKPT], 0, GD_KT, vectors[T_BRKPT], 3);
SETGATE(idt[T_SYSCALL], 1, GD_KT, vectors[T_SYSCALL], 3);

在完成以上的修改之后,通过make grade仍然可以得到80分结果,说明没有问题:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
divzero: OK (1.4s)
softint: OK (1.4s)
badsegment: OK (1.6s)
Part A score: 30/30

faultread: OK (0.9s)
faultreadkernel: OK (1.5s)
faultwrite: OK (2.0s)
faultwritekernel: OK (1.6s)
breakpoint: OK (0.9s)
(Old jos.out.breakpoint failure log removed)
testbss: OK (1.5s)
hello: OK (1.6s)
buggyhello: OK (1.0s)
buggyhello2: OK (1.4s)
evilhello: OK (1.6s)
Part B score: 50/50

Score: 80/80

Challenge 2

Intel手册中12.3.1.4节为关于单步调试的相关内容:

This debug condition occurs at the end of an instruction if the trap flag (TF) of the flags register held the value one at the beginning of that instruction. Note that the exception does not occur at the end of an instruction that sets TF. For example, if POPF is used to set TF, a single-step trap does not occur until after the instruction that follows POPF.

意思就是设置了TF之后,执行完下一条命令会触发一个DEBUG。于是可以照如下写continue和stepi指令:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
int
mon_continue(int argc, char **argv, struct Trapframe *tf)
{
if(!tf)
return 0;
tf->tf_eflags &= ~FL_TF;
return -1;
}

int
mon_stepi(int argc, char **argv, struct Trapframe *tf)
{
if(!tf)
return 0;
tf->tf_eflags |= FL_TF;
return -1;
}

这里需要注意第六行,如果在continue里面不进行eflags维护单纯返回的话,会导致在执行了stepi指令之后,continue指令无效的情况。

同时为了使得stepi触发DEBUG之后能够回到monitor,需要在trap_dispatch()当中添加关于T_DEBUG的分发:

1
2
3
4
5
if(tf->tf_trapno == T_BRKPT || tf->tf_trapno == T_DEBUG)
{
monitor(tf);
return;
}

此时利用continue可以在断点程序之后继续执行,效果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
Incoming TRAP frame at 0xefffffbc
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
TRAP frame at 0xf0228000
edi 0x00000000
esi 0x00000000
ebp 0xeebfdfd0
oesp 0xefffffdc
ebx 0x00000000
edx 0x00000000
ecx 0x00000000
eax 0xeec00000
es 0x----0023
ds 0x----0023
trap 0x00000003 Breakpoint
err 0x00000000
eip 0x00800038
cs 0x----001b
flag 0x00000046
esp 0xeebfdfd0
ss 0x----0023
K> continue
Incoming TRAP frame at 0xefffffbc
[00001000] exiting gracefully
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
K>

Challenge 3

从lab中所给的链接可以找到rdmsrwrmsr的宏定义:

1
2
3
4
5
6
7
8
9
#define rdmsr(msr,val1,val2) \
__asm__ __volatile__("rdmsr" \
: "=a" (val1), "=d" (val2) \
: "c" (msr))

#define wrmsr(msr,val1,val2) \
__asm__ __volatile__("wrmsr" \
: /* no outputs */ \
: "c" (msr), "a" (val1), "d" (val2))

从IA32的手册当中可以找到在使用SYSENTER之前所需要设置的相关内容:

  • IA32_SYSENTER_CS (MSR address 174H) — The lower 16 bits of this MSR are the segment selector for the privilege level 0 code segment. This value is also used to determine the segment selector of the privilege level 0 stack segment (see the Operation section). This value cannot indicate a null selector.
  • IA32_SYSENTER_EIP (MSR address 176H) — The value of this MSR is loaded into RIP (thus, this value references the first instruction of the selected operating procedure or routine). In protected mode, only bits 31:0 are loaded.
  • IA32_SYSENTER_ESP (MSR address 175H) — The value of this MSR is loaded into RSP (thus, this value contains the stack pointer for the privilege level 0 stack). This value cannot represent a non-canonical address. In protected mode, only bits 31:0 are loaded.

添加一个sysenter_init()并且在trap_init()内进行调用来实现初始化:

1
2
3
4
5
6
7
8
void
sysenter_init(void)
{
wrmsr(0x174, GD_KT, 0);
wrmsr(0x176, syscall_fast, 0);
wrmsr(0x175, KSTACKTOP, 0);
return;
}

在lib/syscall.c里面需要仿照syscall写一个syscall_fast,与syscall不同的是,这里采用sysenter而不是int 0x30。同时需要将%esi保存为sysenter之后的位置,并且push和pop保存%ebp。这里参数同syscall相似,只是少了最后一个a5。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
static inline int32_t
syscall_fast(int num, int check, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4)
{
int32_t ret;
asm volatile(
"leal .after_sysenter_label, %%esi\n"
"push %%ebp\n"
"movl %%esp, %%ebp\n"
"sysenter\n"
".after_sysenter_label: popl %%ebp\n"
: "=a" (ret)
: "a" (num),
"d" (a1),
"c" (a2),
"b" (a3),
"D" (a4)
:);

if(check && ret > 0)
panic("syscall %d returned %d (> 0)", num, ret);

return ret;
}

在kern/syscall.c中要写一个handler用来处理对应的系统调用,这个就是之前在init里面所填充的入口,流程为从保存的寄存器中取得参数,执行相应的内容,结束之后将返回值保存并利用sysexit返回(这里如果参数不都采用"=m"的约束会出现bug,原因未知):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
void
syscall_fast(void)
{
uint32_t syscallno, a1, a2, a3, a4, ret;
uint32_t eip, esp;

asm volatile(
"mov %%eax, %0\n"
"mov %%edx, %1\n"
"mov %%ecx, %2\n"
"mov %%ebx, %3\n"
"mov %%edi, %4\n"
"mov %%esi, %5\n"
"mov (%%ebp), %6\n"
:"=r" (syscallno),
"=m" (a1),
"=m" (a2),
"=m" (a3),
"=m" (a4),
"=r" (eip),
"=r" (esp)
);

switch (syscallno) {
case SYS_cputs:
sys_cputs((const char *)a1, (size_t)a2);
ret = 0;
break;

case SYS_cgetc:
ret = sys_cgetc();
break;

case SYS_getenvid:
ret = sys_getenvid();
break;

case SYS_env_destroy:
ret = sys_env_destroy((envid_t)a1);
break;

case NSYSCALLS:
ret = 0;
break;

default:
panic("syscall_fast: wrong syscallno\n");
}

asm volatile(
"sysexit\n"
:
: "a" (ret),
"d" (eip),
"c" (esp)
:);
}

在lib/syscall.c中修改sys_cputs()让其调用syscall_fast()进行测试(实际上就是丢掉最后一个传入的参数就可以了):

1
2
3
4
5
6
void
sys_cputs(const char *s, size_t len)
{
syscall_fast(SYS_cputs, 0, (uint32_t)s, len, 0, 0);
//syscall(SYS_cputs, 0, (uint32_t)s, len, 0, 0, 0);
}

执行hello能够得到的输出如下:

1
2
3
4
5
6
7
8
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
hello, world
i am environment 00001000
Incoming TRAP frame at 0xefffffbc
[00001000] exiting gracefully
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!

对比原来的输出:

1
2
3
4
5
6
7
8
9
10
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
Incoming TRAP frame at 0xefffffbc
hello, world
Incoming TRAP frame at 0xefffffbc
i am environment 00001000
Incoming TRAP frame at 0xefffffbc
[00001000] exiting gracefully
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!

可以发现由于在进行系统调用的时候没有采用int 0x30,所以这里在每次输出前并没有都进入trap()函数,使得少去了两行Incoming TRAP frame at ....的输出。

替换后使用make grade也能够得到满分80分,至少在这个lab中采用sysenter不会有问题。